Project Description
The goal is to build an algorithm that detects a visual signal for pneumonia in medical images. Specifically, the algorithm needs to automatically locate lung opacities on chest radiographs.
The dataset contains the following files and folders:
- stage_2_train_labels.csv – the training set; it contains patientIds and bounding box / target information.
- stage_2_detailed_class_info.csv – detailed information about the type of positive or negative class for each image.
Apart from the above-mentioned data files (in CSV format), the dataset also contains two image folders:
- stage_2_train_images
- stage_2_test_images
The images in the above-mentioned folders are stored in the DICOM format (*.dcm). Each file contains header metadata together with the underlying raw pixel array for the image.
The objective of this task is to perform pre-processing, data visualization, and EDA, which involves the following tasks:
- Exploring the given data files, classes, and images of different classes
- Dealing with missing values
- Visualizing the different classes
- Drawing conclusions from the visualization of the different classes
# Installing pydicom for medical image dataset
!pip install pydicom
Requirement already satisfied: pydicom in /Users/amol/opt/anaconda3/lib/python3.9/site-packages (2.4.3)
# Importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from glob import glob
import os
from matplotlib.patches import Rectangle
import pydicom
from tqdm import tqdm, tqdm_notebook
from skimage.transform import resize
from skimage import io, measure
import cv2, random
import warnings
warnings.filterwarnings('ignore')
# Reading the labels dataset
train_labels= pd.read_csv('/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_labels.csv')
print('First five rows of Training set:\n', train_labels.head())
First five rows of Training set:
patientId x y width height Target
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 NaN NaN NaN NaN 0
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd NaN NaN NaN NaN 0
2 00322d4d-1c29-4943-afc9-b6754be640eb NaN NaN NaN NaN 0
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 NaN NaN NaN NaN 0
4 00436515-870c-4b36-a041-de91049b9ab4 264.0 152.0 213.0 379.0 1
Each row in the CSV file contains a patientId (one unique value per patient), a target (either 0 or 1 for absence or presence of pneumonia, respectively) and the corresponding abnormality bounding box defined by the upper-left hand corner (x, y) coordinate and its corresponding width and height. In this particular case, the patient does not have pneumonia and so the corresponding bounding box information is set to NaN.
patientId - A patientId. Each patientId corresponds to a unique image (which we will see a little bit later)
x - The upper-left x coordinate of the bounding box
y - The upper-left y coordinate of the bounding box
width - The width of the bounding box
height - The height of the bounding box
Target - The binary Target indicating whether this sample has evidence of pneumonia or not.
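For downstream work (e.g., overlap or IoU checks) it can help to convert this (x, y, width, height) convention into corner coordinates. A minimal sketch; the helper name is our own, not part of the dataset:

```python
def to_corners(x, y, width, height):
    """Convert an (x, y, width, height) box to (x1, y1, x2, y2) corners."""
    return (x, y, x + width, y + height)

# Example: the first positive row shown above
print(to_corners(264.0, 152.0, 213.0, 379.0))  # (264.0, 152.0, 477.0, 531.0)
```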
# Number of entries in Train label dataframe:
print('The train_label dataframe has {} rows and {} columns.'.format(train_labels.shape[0], train_labels.shape[1]))
The train_label dataframe has 30227 rows and 6 columns.
train_labels['patientId'].is_unique
False
# Number of duplicates in patientId:
print('Number of unique patientId are: {}'.format(train_labels['patientId'].nunique()))
Number of unique patientId are: 26684
Thus, the dataset contains information about 26684 unique patients, some of whom have multiple entries.
print(f'No of entries which has Pneumonia: {train_labels[train_labels.Target == 1].shape[0]} i.e., {round(train_labels[train_labels.Target == 1].shape[0]/train_labels.shape[0]*100, 0)}%')
print(f'No of entries which don\'t have Pneumonia: {train_labels[train_labels.Target == 0].shape[0]} i.e., {round(train_labels[train_labels.Target == 0].shape[0]/train_labels.shape[0]*100, 0)}%')
_ = train_labels['Target'].value_counts().plot(kind = 'pie', autopct = '%.0f%%', labels = ['Negative', 'Positive'], figsize = (10, 6))
No of entries which has Pneumonia: 9555 i.e., 32.0%
No of entries which don't have Pneumonia: 20672 i.e., 68.0%
Thus, from the above pie chart it is clear that out of the 30227 entries in the dataset, 20672 (i.e., 68%) correspond to patients not having pneumonia, whereas 9555 (i.e., 32%) correspond to positive cases of pneumonia.
train_labels['Target'].value_counts()
Target
0    20672
1     9555
Name: count, dtype: int64
duplicates = train_labels[train_labels.duplicated(['patientId'])]
duplicates.shape
(3543, 6)
The number of unique patientIds is 26684, while the dataset has 30227 rows. Thus, some patients have multiple entries; there are 3543 duplicated patientId rows.
duplicates.head()
| patientId | x | y | width | height | Target | |
|---|---|---|---|---|---|---|
| 5 | 00436515-870c-4b36-a041-de91049b9ab4 | 562.0 | 152.0 | 256.0 | 453.0 | 1 |
| 9 | 00704310-78a8-4b38-8475-49f4573b2dbb | 695.0 | 575.0 | 162.0 | 137.0 | 1 |
| 15 | 00aecb01-a116-45a2-956c-08d2fa55433f | 547.0 | 299.0 | 119.0 | 165.0 | 1 |
| 17 | 00c0b293-48e7-4e16-ac76-9269ba535a62 | 650.0 | 511.0 | 206.0 | 284.0 | 1 |
| 20 | 00f08de1-517e-4652-a04f-d1dc9ee48593 | 571.0 | 275.0 | 230.0 | 476.0 | 1 |
train_labels[train_labels.patientId=='00436515-870c-4b36-a041-de91049b9ab4']
| patientId | x | y | width | height | Target | |
|---|---|---|---|---|---|---|
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 |
| 5 | 00436515-870c-4b36-a041-de91049b9ab4 | 562.0 | 152.0 | 256.0 | 453.0 | 1 |
train_labels[train_labels.patientId=='00c0b293-48e7-4e16-ac76-9269ba535a62']
| patientId | x | y | width | height | Target | |
|---|---|---|---|---|---|---|
| 16 | 00c0b293-48e7-4e16-ac76-9269ba535a62 | 306.0 | 544.0 | 168.0 | 244.0 | 1 |
| 17 | 00c0b293-48e7-4e16-ac76-9269ba535a62 | 650.0 | 511.0 | 206.0 | 284.0 | 1 |
train_labels[train_labels.patientId=='00aecb01-a116-45a2-956c-08d2fa55433f']
| patientId | x | y | width | height | Target | |
|---|---|---|---|---|---|---|
| 14 | 00aecb01-a116-45a2-956c-08d2fa55433f | 288.0 | 322.0 | 94.0 | 135.0 | 1 |
| 15 | 00aecb01-a116-45a2-956c-08d2fa55433f | 547.0 | 299.0 | 119.0 | 165.0 | 1 |
Checking the duplicated patientIds above, we can see that the x, y, width and height values differ between rows. This indicates that the same patient has two bounding boxes in the same DICOM image.
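If the boxes are needed per image rather than per row, they can be collected into one list per patientId instead of being dropped. A small sketch on synthetic rows that mirror the structure above (the patient IDs are illustrative):

```python
import pandas as pd

# Synthetic rows: p1 has two boxes, p2 is a negative case with no box
df = pd.DataFrame({
    'patientId': ['p1', 'p1', 'p2'],
    'x': [264.0, 562.0, None], 'y': [152.0, 152.0, None],
    'width': [213.0, 256.0, None], 'height': [379.0, 453.0, None],
    'Target': [1, 1, 0],
})

# Collect all bounding boxes for each patient into a single list
boxes = (df.dropna(subset=['x'])
           .groupby('patientId')[['x', 'y', 'width', 'height']]
           .apply(lambda g: g.values.tolist()))
print(boxes['p1'])  # two boxes belonging to the same image
```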
# dropping duplicates
total_labels = train_labels.drop_duplicates('patientId')
total_labels.shape
(26684, 6)
# Checking nulls in bounding box columns:
print('Number of nulls in bounding box columns: {}'.format(train_labels[['x', 'y', 'width', 'height']].isnull().sum().to_dict()))
Number of nulls in bounding box columns: {'x': 20672, 'y': 20672, 'width': 20672, 'height': 20672}
Thus, we can see that number of nulls in bounding box columns are equal to the number of 0's we have in the Target column.
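This relationship can also be asserted programmatically; a sketch on a tiny synthetic frame (in the real notebook, `train_labels` would take the place of `df`):

```python
import pandas as pd

df = pd.DataFrame({
    'x': [None, None, 264.0],
    'Target': [0, 0, 1],
})

# A row has NaN box coordinates exactly when its Target is 0
assert (df['x'].isnull() == (df['Target'] == 0)).all()
print('NaN boxes line up exactly with negative targets')
```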
bounding_box = train_labels.groupby('patientId').size().to_frame('number_of_boxes').reset_index()
train_labels = train_labels.merge(bounding_box, on = 'patientId', how = 'left')  # note: re-running this cell merges again, leaving suffixed duplicates (number_of_boxes_x/_y) as seen in the head() output below
print('Number of patientIds per bounding box in the dataset: ')
(bounding_box.groupby('number_of_boxes').size().to_frame('number_of_patientId').reset_index().set_index('number_of_boxes').sort_values(by = 'number_of_boxes'))
Number of patientIds per bounding box in the dataset:
| number_of_patientId | |
|---|---|
| number_of_boxes | |
| 1 | 23286 |
| 2 | 3266 |
| 3 | 119 |
| 4 | 13 |
train_labels.head()
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | |
|---|---|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | NaN | NaN | NaN | NaN | 0 | 1 | 1 |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | NaN | NaN | NaN | NaN | 0 | 1 | 1 |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | NaN | NaN | NaN | NaN | 0 | 1 | 1 |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | NaN | NaN | NaN | NaN | 0 | 1 | 1 |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 | 2 | 2 |
Thus, 23286 unique patients have only one entry in the dataset, 3266 have 2 bounding boxes, 119 have 3, and 13 have 4 bounding boxes.
#label_count=train_labels['Target'].value_counts()
label_count=total_labels['Target'].value_counts()
explode = (0.03,0.03)
fig1, ax1 = plt.subplots(figsize=(5,5))
ax1.pie(label_count.values, explode=explode, labels=['Negative', 'Positive'], autopct='%1.1f%%',
shadow=True, startangle=90)
#ax1.axis('equal')
plt.title('Target Distribution')
plt.show()
After dropping duplicates, 22.5% of patients have pneumonia and the remaining 77.5% do not. There is a class imbalance issue.
# Reading the class info dataset
class_labels = pd.read_csv('//Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_detailed_class_info.csv')
print('First five rows of Class label dataset are:\n', class_labels.head())
First five rows of Class label dataset are:
patientId class
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 No Lung Opacity / Not Normal
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd No Lung Opacity / Not Normal
2 00322d4d-1c29-4943-afc9-b6754be640eb No Lung Opacity / Not Normal
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 Normal
4 00436515-870c-4b36-a041-de91049b9ab4 Lung Opacity
Some information about the data field present in the 'stage_2_detailed_class_info.csv' are:
patientId - A patientId. Each patientId corresponds to a unique image
class - Have three values depending what is the current state of the patient's lung: 'No Lung Opacity / Not Normal', 'Normal' and 'Lung Opacity'.
# Checking the shape of Class labels
print('The class_label dataframe has {} rows and {} columns.'.format(class_labels.shape[0], class_labels.shape[1]))
The class_label dataframe has 30227 rows and 2 columns.
class_labels['patientId'].is_unique
False
# Number of duplicates in patients:
print('Number of unique patientId are: {}'.format(class_labels['patientId'].nunique()))
Number of unique patientId are: 26684
The same number of duplicates is present in the class info dataset as in the labels dataset.
duplicates_class = class_labels[class_labels.duplicated(['patientId'])]
duplicates_class.shape
(3543, 2)
duplicates_class.head()
| patientId | class | |
|---|---|---|
| 5 | 00436515-870c-4b36-a041-de91049b9ab4 | Lung Opacity |
| 9 | 00704310-78a8-4b38-8475-49f4573b2dbb | Lung Opacity |
| 15 | 00aecb01-a116-45a2-956c-08d2fa55433f | Lung Opacity |
| 17 | 00c0b293-48e7-4e16-ac76-9269ba535a62 | Lung Opacity |
| 20 | 00f08de1-517e-4652-a04f-d1dc9ee48593 | Lung Opacity |
All duplicated records belong to the Lung Opacity class, i.e., pneumonia cases.
class_labels[class_labels.patientId=='00436515-870c-4b36-a041-de91049b9ab4']
| patientId | class | |
|---|---|---|
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | Lung Opacity |
| 5 | 00436515-870c-4b36-a041-de91049b9ab4 | Lung Opacity |
class_labels[class_labels.patientId=='00704310-78a8-4b38-8475-49f4573b2dbb']
| patientId | class | |
|---|---|---|
| 8 | 00704310-78a8-4b38-8475-49f4573b2dbb | Lung Opacity |
| 9 | 00704310-78a8-4b38-8475-49f4573b2dbb | Lung Opacity |
class_labels[class_labels.patientId=='00aecb01-a116-45a2-956c-08d2fa55433f']
| patientId | class | |
|---|---|---|
| 14 | 00aecb01-a116-45a2-956c-08d2fa55433f | Lung Opacity |
| 15 | 00aecb01-a116-45a2-956c-08d2fa55433f | Lung Opacity |
def get_feature_distribution(data, feature):
# Count for each label
label_counts = data[feature].value_counts()
# Count the number of items in each class
total_samples = len(data)
print("Feature: {}".format(feature))
for i in range(len(label_counts)):
label = label_counts.index[i]
count = label_counts.values[i]
percent = int((count / total_samples) * 10000) / 100
print("{:<30s}: {} which is {}% of the total data in the dataset".format(label, count, percent))
get_feature_distribution(class_labels, 'class')
Feature: class
No Lung Opacity / Not Normal  : 11821 which is 39.1% of the total data in the dataset
Lung Opacity                  : 9555 which is 31.61% of the total data in the dataset
Normal                        : 8851 which is 29.28% of the total data in the dataset
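The same percentages can be computed more idiomatically with `value_counts(normalize=True)`, which avoids the manual loop; a sketch on a toy series:

```python
import pandas as pd

classes = pd.Series(['Lung Opacity', 'Normal', 'Normal',
                     'No Lung Opacity / Not Normal'])

# normalize=True returns fractions; scale to percentages and round
dist = classes.value_counts(normalize=True).mul(100).round(2)
print(dist.to_dict())  # 'Normal' is 50.0, the other two classes 25.0 each
```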
figsize = (10, 6)
_ = class_labels['class'].value_counts().sort_index(ascending = False).plot(kind = 'pie', autopct = '%.0f%%').set_ylabel('')
There are 8851 normal cases, 9555 patients with lung opacity, and 11821 cases labelled No Lung Opacity / Not Normal.
# Dropping duplicates
total_classes = class_labels.drop_duplicates('patientId')
total_classes.shape
(26684, 2)
#label_count=class_labels['class'].value_counts()
class_count=total_classes['class'].value_counts()
explode = (0.03,0.03,0.03)
fig1, ax1 = plt.subplots(figsize=(5,5))
ax1.pie(class_count.values, explode=explode, labels=class_count.index, autopct='%1.1f%%',
shadow=True, startangle=90)
#ax1.axis('equal')
plt.title('Class Distribution after dropping duplicates')
plt.show()
# Concatenating the two datasets - 'train_labels' and 'class_labels':
training_data = pd.concat([train_labels, class_labels['class']], axis = 1)
training_data.head()
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | Normal |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 | 2 | 2 | Lung Opacity |
# Dropping duplicates
training_data_wo_duplicates = training_data.drop_duplicates('patientId')
training_data_wo_duplicates.shape
(26684, 9)
training_data_wo_duplicates
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | Normal |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 | 2 | 2 | Lung Opacity |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 30219 | c1e73a4e-7afe-4ec5-8af6-ce8315d7a2f2 | 666.0 | 418.0 | 186.0 | 223.0 | 1 | 2 | 2 | Lung Opacity |
| 30221 | c1ec14ff-f6d7-4b38-b0cb-fe07041cbdc8 | 609.0 | 464.0 | 240.0 | 284.0 | 1 | 2 | 2 | Lung Opacity |
| 30223 | c1edf42b-5958-47ff-a1e7-4f23d99583ba | NaN | NaN | NaN | NaN | 0 | 1 | 1 | Normal |
| 30224 | c1f6b555-2eb1-4231-98f6-50a963976431 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | Normal |
| 30225 | c1f7889a-9ea9-4acb-b64c-b737c929599a | 570.0 | 393.0 | 261.0 | 345.0 | 1 | 2 | 2 | Lung Opacity |
26684 rows × 9 columns
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# Assuming training_data_wo_duplicates is already defined and is the cleaned DataFrame
fig, ax = plt.subplots(nrows = 1, figsize = (12, 6))
temp = training_data_wo_duplicates.groupby('Target')['class'].value_counts()
data_target_class = pd.DataFrame(data = {'Values': temp.values}, index = temp.index).reset_index()
# Creating the barplot
sns.barplot(ax = ax, x = 'Target', y = 'Values', hue = 'class', data = data_target_class, palette = 'Set1')
# Adding title
plt.title('Class and Target Distribution')
# Annotating the bars with the value counts
for p in ax.patches:
ax.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
ha='center', va='center', fontsize=11, color='black', xytext=(0, 5),
textcoords='offset points')
plt.show()
Target = 1 is associated only with class = Lung Opacity, whereas Target = 0 is associated with both class = No Lung Opacity / Not Normal and class = Normal.
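This Target-to-class relationship can be confirmed with `pd.crosstab`; a sketch on synthetic rows that mirror the mapping described above:

```python
import pandas as pd

df = pd.DataFrame({
    'Target': [1, 1, 0, 0, 0],
    'class': ['Lung Opacity', 'Lung Opacity', 'Normal',
              'No Lung Opacity / Not Normal', 'Normal'],
})

# Cross-tabulate: positives should fall only under 'Lung Opacity'
ct = pd.crosstab(df['Target'], df['class'])
print(ct)
assert ct.loc[1].drop('Lung Opacity').eq(0).all()
```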
fig, ax = plt.subplots(1, 1, figsize = (7, 7))
target_1 = training_data[training_data['Target'] == 1]
target_sample = target_1.sample(5000)
target_sample['xc'] = target_sample['x'] + target_sample['width'] / 2
target_sample['yc'] = target_sample['y'] + target_sample['height'] / 2
plt.title('Centers of Lung Opacity Rectangles (brown) over rectangles (yellow)\nSample Size: 5000')
target_sample.plot.scatter(x = 'xc', y = 'yc', xlim = (0, 1024), ylim = (0, 1024), ax = ax, alpha = 0.8, marker = '.', color = 'brown')
for i, crt_sample in target_sample.iterrows():
ax.add_patch(Rectangle(xy=(crt_sample['x'], crt_sample['y']),
width=crt_sample['width'],height=crt_sample['height'],alpha=3.5e-3, color="yellow"))
We can see that the centers of the bounding boxes are spread fairly evenly across the lungs. Although a large portion of the boxes have their centers near the middle of the lungs, some centers are located at the lung edges.
They contain a combination of header metadata as well as the underlying raw pixel arrays. We can access and manipulate DICOM files using the pydicom module. To use pydicom, first let us find the DICOM file for a given patientId by looking for the matching file in the stage_2_train_images/ folder, and then use the pydicom.dcmread() method (read_file is a deprecated alias) to load the data:
sample_patientId = train_labels['patientId'][0]
dcm_file = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/' + '{}.dcm'.format(sample_patientId)
dcm_data = pydicom.dcmread(dcm_file)  # dcmread is the current name; read_file is a deprecated alias
print('Metadata of the image consists of \n', dcm_data)
Metadata of the image consists of
Dataset.file_meta -------------------------------
(0002, 0000) File Meta Information Group Length  UL: 202
(0002, 0001) File Meta Information Version       OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID         UI: Secondary Capture Image Storage
(0002, 0003) Media Storage SOP Instance UID      UI: 1.2.276.0.7230010.3.1.4.8323329.28530.1517874485.775526
(0002, 0010) Transfer Syntax UID                 UI: JPEG Baseline (Process 1)
(0002, 0012) Implementation Class UID            UI: 1.2.276.0.7230010.3.0.3.6.0
(0002, 0013) Implementation Version Name         SH: 'OFFIS_DCMTK_360'
-------------------------------------------------
(0008, 0005) Specific Character Set              CS: 'ISO_IR 100'
(0008, 0016) SOP Class UID                       UI: Secondary Capture Image Storage
(0008, 0018) SOP Instance UID                    UI: 1.2.276.0.7230010.3.1.4.8323329.28530.1517874485.775526
(0008, 0020) Study Date                          DA: '19010101'
(0008, 0030) Study Time                          TM: '000000.00'
(0008, 0050) Accession Number                    SH: ''
(0008, 0060) Modality                            CS: 'CR'
(0008, 0064) Conversion Type                     CS: 'WSD'
(0008, 0090) Referring Physician's Name          PN: ''
(0008, 103e) Series Description                  LO: 'view: PA'
(0010, 0010) Patient's Name                      PN: '0004cfab-14fd-4e49-80ba-63a80b6bddd6'
(0010, 0020) Patient ID                          LO: '0004cfab-14fd-4e49-80ba-63a80b6bddd6'
(0010, 0030) Patient's Birth Date                DA: ''
(0010, 0040) Patient's Sex                       CS: 'F'
(0010, 1010) Patient's Age                       AS: '51'
(0018, 0015) Body Part Examined                  CS: 'CHEST'
(0018, 5101) View Position                       CS: 'PA'
(0020, 000d) Study Instance UID                  UI: 1.2.276.0.7230010.3.1.2.8323329.28530.1517874485.775525
(0020, 000e) Series Instance UID                 UI: 1.2.276.0.7230010.3.1.3.8323329.28530.1517874485.775524
(0020, 0010) Study ID                            SH: ''
(0020, 0011) Series Number                       IS: '1'
(0020, 0013) Instance Number                     IS: '1'
(0020, 0020) Patient Orientation                 CS: ''
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 1024
(0028, 0011) Columns                             US: 1024
(0028, 0030) Pixel Spacing                       DS: [0.14300000000000002, 0.14300000000000002]
(0028, 0100) Bits Allocated                      US: 8
(0028, 0101) Bits Stored                         US: 8
(0028, 0102) High Bit                            US: 7
(0028, 0103) Pixel Representation                US: 0
(0028, 2110) Lossy Image Compression             CS: '01'
(0028, 2114) Lossy Image Compression Method      CS: 'ISO_10918_1'
(7fe0, 0010) Pixel Data                          OB: Array of 142006 elements
From the above sample we can see that the DICOM file contains information usable for further analysis, such as sex, age, body part examined (which should mostly be chest), view position, and modality. The size of this image is 1024 x 1024 (rows x columns).
Demographic Analysis: Use patient’s age, sex, etc., to perform demographic analysis as part of EDA.
Image Preprocessing: Utilize image-related metadata for normalizing and resizing images before feeding them to the model.
Data Augmentation: View Position can be used to generate different augmented versions of the same image, such as flips or rotations, which might help in improving model robustness.
Time-series Analysis: If multiple studies for the same patient are available, Study Date and Study Time can help in creating time-based features or for tracking the progression of diseases.
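As a hedged sketch of the demographic step, ages parsed from the DICOM headers can be bucketed with `pd.cut` (the values below are illustrative; PatientAge is stored as a string in the header):

```python
import pandas as pd

# Hypothetical ages, parsed from DICOM PatientAge strings
ages = pd.to_numeric(pd.Series(['51', '48', '19', '28', '32']))

# Bucket ages into twenty-year-wide bins for a demographic breakdown
bins = pd.cut(ages, bins=[0, 20, 40, 60, 80, 100],
              labels=['0-20', '21-40', '41-60', '61-80', '81-100'])
print(bins.value_counts().sort_index().to_dict())
```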
print('Number of images in training images folders are: {}.'.format(len(os.listdir('/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/'))))
Number of images in training images folders are: 26684.
We can see that the training images folder contains exactly 26684 images, the same as the number of unique patientIds in either CSV file. Thus, each unique patientId corresponds to one image in the folder.
training_image_path = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/'
# Use the glob function to get the list of all .dcm files in the specified directory and create a DataFrame with this information.
# glob(os.path.join(training_image_path, '*.dcm')) will generate a list of all .dcm files in the specified path.
# pd.DataFrame is used to convert this list into a DataFrame with a single column named 'path'.
# Extract the 'patientId' from the 'path' column.
# For each path in 'path' column, os.path.basename(x) gets the filename with extension (e.g. 'example.dcm') from the full path.
# os.path.splitext(...) then splits the filename into name ('example') and extension ('.dcm'), and [0] extracts the name part which is the 'patientId'.
images = pd.DataFrame({'path': glob(os.path.join(training_image_path, '*.dcm'))})
images['patientId'] = images['path'].map(lambda x:os.path.splitext(os.path.basename(x))[0])
print('Columns in the training images dataframe: {}'.format(list(images.columns)))
Columns in the training images dataframe: ['path', 'patientId']
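The `os.path.splitext(os.path.basename(...))` combination used above can also be expressed with `pathlib`, whose `.stem` property strips both the directory and the extension in one step (the path below is illustrative):

```python
from pathlib import Path

# Illustrative .dcm path; .stem recovers the patientId directly
p = Path('stage_2_train_images/00436515-870c-4b36-a041-de91049b9ab4.dcm')
print(p.stem)  # 00436515-870c-4b36-a041-de91049b9ab4
```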
testing_image_path = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_test_images/'
test_images = pd.DataFrame({'path': glob(os.path.join(testing_image_path, '*.dcm'))})
test_images['patientId'] = test_images['path'].map(lambda x:os.path.splitext(os.path.basename(x))[0])
print('Columns in the testing images dataframe: {}'.format(list(test_images.columns)))
Columns in the testing images dataframe: ['path', 'patientId']
print('Number of images in testing images folders are: {}.'.format(len(os.listdir('/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_test_images/'))))
Number of images in testing images folders are: 3000.
test_images.head()
| path | patientId | |
|---|---|---|
| 0 | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 2392af63-9496-4e72-b348-9276432fd797 |
| 1 | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 2ce40417-1531-4101-be24-e85416c812cc |
| 2 | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 2bc0fd91-931a-446f-becb-7a6d3f2a7678 |
| 3 | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 29d42f45-5046-4112-87fa-18ea6ea97e75 |
| 4 | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 208e3daf-18cb-4bf7-8325-0acf318ed62c |
test_images.shape
(3000, 2)
# Merging the images dataframe with training_data dataframe
training_data = training_data.merge(images, on = 'patientId', how = 'left')
print('After merging the two dataframe, the training_data has {} rows and {} columns.'.format(training_data.shape[0], training_data.shape[1]))
print('\nColumns in the training images dataframe: {}'.format(list(training_data.columns)))
After merging the two dataframe, the training_data has 30227 rows and 10 columns.

Columns in the training images dataframe: ['patientId', 'x', 'y', 'width', 'height', 'Target', 'number_of_boxes_x', 'number_of_boxes_y', 'class', 'path']
print('The training_data dataframe as of now stands like\n')
training_data.head()
The training_data dataframe as of now stands like
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 | 2 | 2 | Lung Opacity | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
# Merging the test_images dataframe with training_data dataframe
testing_data = training_data.merge(test_images, on = 'patientId', how = 'right')
print('After merging the two dataframe, the testing_data has {} rows and {} columns.'.format(testing_data.shape[0], testing_data.shape[1]))
After merging the two dataframe, the testing_data has 3000 rows and 11 columns.
testing_data.head()
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path_x | path_y | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2392af63-9496-4e72-b348-9276432fd797 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 1 | 2ce40417-1531-4101-be24-e85416c812cc | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 2 | 2bc0fd91-931a-446f-becb-7a6d3f2a7678 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 3 | 29d42f45-5046-4112-87fa-18ea6ea97e75 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 4 | 208e3daf-18cb-4bf7-8325-0acf318ed62c | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
testing_data=testing_data.drop(['path_x'], axis=1)
testing_data=testing_data.rename(columns={'path_y':'path'})
testing_data.head()
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2392af63-9496-4e72-b348-9276432fd797 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 1 | 2ce40417-1531-4101-be24-e85416c812cc | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 2 | 2bc0fd91-931a-446f-becb-7a6d3f2a7678 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 3 | 29d42f45-5046-4112-87fa-18ea6ea97e75 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
| 4 | 208e3daf-18cb-4bf7-8325-0acf318ed62c | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
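The drop/rename pair used above is needed because both frames carry a `path` column; pandas' `suffixes` argument to `merge` controls how such a clash is labelled, which can avoid the cleanup entirely. A sketch on toy frames:

```python
import pandas as pd

left = pd.DataFrame({'patientId': ['a'], 'path': ['train/a.dcm']})
right = pd.DataFrame({'patientId': ['a', 'b'], 'path': ['img/a.dcm', 'img/b.dcm']})

# suffixes labels the clashing columns; the empty suffix keeps the
# right-hand 'path' unchanged, and a right join keeps every test image
merged = left.merge(right, on='patientId', how='right', suffixes=('_train', ''))
print(list(merged.columns))  # ['patientId', 'path_train', 'path']
```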
Now both the training and testing DataFrames reference their images through a consistent path column; note that the testing set carries no label or class information.
testing_data[testing_data['patientId']=='0000a175-0e68-4ca4-b1af-167204a7e0bc']
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path | |
|---|---|---|---|---|---|---|---|---|---|---|
| 1234 | 0000a175-0e68-4ca4-b1af-167204a7e0bc | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
training_data[training_data['patientId']=='0000a175-0e68-4ca4-b1af-167204a7e0bc']
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path |
|---|---|---|---|---|---|---|---|---|---|
testing_data[testing_data['patientId']=='c1e88810-9e4e-4f39-9306-8d314bfc1ff1']
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path | |
|---|---|---|---|---|---|---|---|---|---|---|
| 2551 | c1e88810-9e4e-4f39-9306-8d314bfc1ff1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | /Users/amol/Downloads/AIMLProjects/Capstone/GL... |
training_data[training_data['patientId']=='c1e88810-9e4e-4f39-9306-8d314bfc1ff1']
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path |
|---|---|---|---|---|---|---|---|---|---|
From the above, we can say for certain that the 3000 images in the stage_2_test_images folder are not present in the training images folder (stage_2_train_images); hence, we have no class information for the testing images.
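This disjointness claim can be checked directly with a set intersection over patientIds; a sketch on toy ID lists (in the notebook, `images['patientId']` and `test_images['patientId']` would take their place):

```python
import pandas as pd

train_ids = pd.Series(['a1', 'b2', 'c3'])
test_ids = pd.Series(['d4', 'e5'])

# An empty intersection confirms the train/test image folders are disjoint
overlap = set(train_ids) & set(test_ids)
print(f'{len(overlap)} patientIds appear in both folders')
assert not overlap
```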
columns_to_add = ['PatientAge', 'PatientSex']
def parse_dicom_data(data_df, data_path):
    # Initialise the new metadata columns
    for col in columns_to_add:
        data_df[col] = None
    image_names = os.listdir(data_path)
    for img_name in tqdm(image_names):
        imagepath = os.path.join(data_path, img_name)
        data_img = pydicom.dcmread(imagepath)  # dcmread replaces the deprecated read_file alias
        # Match every row for this patient (positives can have several bounding-box rows)
        idx = (data_df['patientId'] == data_img.PatientID)
        data_df.loc[idx, 'PatientAge'] = pd.to_numeric(data_img.PatientAge)
        data_df.loc[idx, 'PatientSex'] = data_img.PatientSex
parse_dicom_data(training_data, '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/')
print('After parsing the information from the dicom images, our training_data dataframe has {} rows and {} columns and it looks like:\n'.format(training_data.shape[0], training_data.shape[1]))
training_data.head()
After parsing the information from the dicom images, our training_data dataframe has 30227 rows and 12 columns and it looks like:
| patientId | x | y | width | height | Target | number_of_boxes_x | number_of_boxes_y | class | path | PatientAge | PatientSex | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 51 | F |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 48 | F |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | NaN | NaN | NaN | NaN | 0 | 1 | 1 | No Lung Opacity / Not Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 19 | M |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | NaN | NaN | NaN | NaN | 0 | 1 | 1 | Normal | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 28 | M |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 | 2 | 2 | Lung Opacity | /Users/amol/Downloads/AIMLProjects/Capstone/GL... | 32 | F |
# Saving the training_data for further use:
training_data.to_pickle('/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/training_data.pkl')
# Loading the training dataset from pickled file above
import pickle
file_path= '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/training_data.pkl'
training_data = pickle.load(open(file_path, "rb"))
# Dropping duplicates
training_data_wo_duplicates = training_data.drop_duplicates('patientId')
# PatientSex_count=training_data['PatientSex'].value_counts()
PatientSex_count=training_data_wo_duplicates['PatientSex'].value_counts()
explode = (0.03,0.03)
fig1, ax1 = plt.subplots(figsize=(5,5))
ax1.pie(PatientSex_count.values, explode=explode, labels=PatientSex_count.index, autopct='%1.1f%%',
shadow=True, startangle=90)
#ax1.axis('equal')
plt.title('PatientSex Distribution')
plt.show()
import seaborn as sns
import matplotlib.pyplot as plt
# Distribution of PatientSex among the targets
fig, ax = plt.subplots(nrows = 1, figsize = (12, 6))
temp = training_data_wo_duplicates.groupby('Target')['PatientSex'].value_counts()
data_target_class = pd.DataFrame(data = {'Values': temp.values}, index = temp.index).reset_index()
sns.barplot(ax = ax, x = 'Target', y = 'Values', hue = 'PatientSex', data = data_target_class, palette = 'Set1')
# Adding the count above the bars
for p in ax.patches:
    ax.annotate(f'{p.get_height():.0f}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center', fontsize=11, color='black', xytext=(0, 5),
                textcoords='offset points')
plt.title('PatientSex vs Target')
plt.show()
# Distribution of Sex Among the classes
fig, ax = plt.subplots(nrows = 1, figsize = (12, 6))
temp = training_data_wo_duplicates.groupby('class')['PatientSex'].value_counts()
data_target_class = pd.DataFrame(data = {'Values': temp.values}, index = temp.index).reset_index()
sns.barplot(ax = ax, x = 'class', y = 'Values', hue = 'PatientSex', data = data_target_class, palette = 'Set1')
# Adding the count above the bars
for p in ax.patches:
    ax.annotate(f'{p.get_height():.0f}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center', fontsize=11, color='black', xytext=(0, 5),
                textcoords='offset points')
plt.title('Class vs PatientSex')
plt.show()
# Distribution of classes among the sexes
fig, ax = plt.subplots(nrows = 1, figsize = (12, 6))
temp = training_data_wo_duplicates.groupby('PatientSex')['class'].value_counts()
data_target_class = pd.DataFrame(data = {'Values': temp.values}, index = temp.index).reset_index()
sns.barplot(ax = ax, x = 'PatientSex', y = 'Values', hue = 'class', data = data_target_class, palette = 'Set1')
# Adding the count above the bars
for p in ax.patches:
    ax.annotate(f'{p.get_height():.0f}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center', fontsize=11, color='black', xytext=(0, 5),
                textcoords='offset points')
plt.title('PatientSex vs Class')
plt.show()
data_target_class
|   | PatientSex | class | Values |
|---|---|---|---|
| 0 | F | No Lung Opacity / Not Normal | 5111 |
| 1 | F | Normal | 3905 |
| 2 | F | Lung Opacity | 2502 |
| 3 | M | No Lung Opacity / Not Normal | 6710 |
| 4 | M | Normal | 4946 |
| 5 | M | Lung Opacity | 3510 |
training_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30227 entries, 0 to 30226
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   patientId        30227 non-null  object
 1   x                9555 non-null   float64
 2   y                9555 non-null   float64
 3   width            9555 non-null   float64
 4   height           9555 non-null   float64
 5   Target           30227 non-null  int64
 6   number_of_boxes  30227 non-null  int64
 7   class            30227 non-null  object
 8   path             30227 non-null  object
 9   PatientAge       30227 non-null  object
 10  PatientSex       30227 non-null  object
dtypes: float64(4), int64(2), object(5)
memory usage: 2.5+ MB
training_data['PatientAge'] = training_data.PatientAge.astype(int)
training_data_wo_duplicates.info()
<class 'pandas.core.frame.DataFrame'>
Index: 26684 entries, 0 to 30225
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   patientId        26684 non-null  object
 1   x                6012 non-null   float64
 2   y                6012 non-null   float64
 3   width            6012 non-null   float64
 4   height           6012 non-null   float64
 5   Target           26684 non-null  int64
 6   number_of_boxes  26684 non-null  int64
 7   class            26684 non-null  object
 8   path             26684 non-null  object
 9   PatientAge       26684 non-null  object
 10  PatientSex       26684 non-null  object
dtypes: float64(4), int64(2), object(5)
memory usage: 2.4+ MB
training_data_wo_duplicates['PatientAge'] = training_data_wo_duplicates.PatientAge.astype(int)
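The info() output above shows that x, y, width and height are NaN exactly for rows without an annotated bounding box (Target = 0). A minimal sketch, on a toy frame with hypothetical values, of one common way to deal with these missing values (filling with 0 so positive boxes stay intact):

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the label schema (values are hypothetical).
df = pd.DataFrame({
    'patientId': ['a', 'b'],
    'x': [np.nan, 264.0], 'y': [np.nan, 152.0],
    'width': [np.nan, 213.0], 'height': [np.nan, 379.0],
    'Target': [0, 1],
})
# NaNs in the box columns only occur for Target == 0 (no box annotated),
# so filling them with 0 leaves the positive-class boxes untouched.
box_cols = ['x', 'y', 'width', 'height']
df[box_cols] = df[box_cols].fillna(0)
print(df.loc[0, box_cols].tolist())  # [0.0, 0.0, 0.0, 0.0]
```

Whether 0-filling or keeping the NaNs is preferable depends on the downstream model; for pure EDA the NaNs can simply be left in place.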
sns.histplot(training_data_wo_duplicates.PatientAge, kde=True, stat='density')  # distplot is deprecated
<Axes: xlabel='PatientAge', ylabel='Density'>
We can see that most patients are aged between 50 and 60 years.
plt.figure(figsize=(10,5))
ax = sns.barplot(x='class', y='PatientAge', data=training_data_wo_duplicates, palette='Set1')
We can see that the mean age of patients with lung opacities (pneumonia) falls in the 40-50 year range.
plt.figure(figsize=(10,5))
ax = sns.barplot(x='Target', y='PatientAge', data=training_data_wo_duplicates, palette='Set1')
# Function to read DICOM images and display them along with metadata and bounding boxes
def show_dicom_images(data, df, img_path):
    img_data = list(data.T.to_dict().values())
    f, ax = plt.subplots(3, 3, figsize=(16, 18))
    for i, row in enumerate(img_data):
        path = os.path.join(img_path, row['patientId'] + '.dcm')
        dcm = pydicom.dcmread(path)  # read once: header metadata and pixel data
        rows = df[df['patientId'] == row['patientId']]
        age = rows.PatientAge.unique().tolist()[0]
        sex = dcm.PatientSex
        ax[i//3, i%3].imshow(dcm.pixel_array, cmap=plt.cm.bone)
        ax[i//3, i%3].axis('off')
        ax[i//3, i%3].set_title('ID: {}\nAge: {}, Sex: {}, \nTarget: {}, Class: {}\nWindow: {}:{}:{}:{}'
                                .format(row['patientId'], age, sex, row['Target'],
                                        row['class'], row['x'], row['y'],
                                        row['width'], row['height']))
        # Draw every bounding box recorded for this patient
        # (use a separate loop variable to avoid shadowing `row`)
        box_data = list(rows.T.to_dict().values())
        for j, box in enumerate(box_data):
            ax[i//3, i%3].add_patch(Rectangle(xy=(box['x'], box['y']),
                                              width=box['width'], height=box['height'],
                                              edgecolor='r', linewidth=2, facecolor='none'))
    plt.show()
show_dicom_images(data = training_data.loc[(training_data['Target'] == 0)].sample(9),
df = training_data, img_path = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images')
show_dicom_images(data = training_data.loc[(training_data['Target'] == 1)].sample(9),
df = training_data, img_path = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images')
The training dataset (both CSV files and the training image folder) contains information on 26684 unique patients.
Some of these patients have multiple entries in the CSV files (one row per bounding box). Most of the recorded patients belong to Target = 0 (i.e., they do not have pneumonia).
The classes "No Lung Opacity / Not Normal" and "Normal" are associated with Target = 0, whereas "Lung Opacity" belongs to Target = 1.
The images are in DICOM format, from which information such as PatientAge and PatientSex is obtained.
The centers of the bounding boxes are spread over the entire lung region.
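The box-center observation above can be checked directly: the center of each box is its top-left corner plus half its extent. A sketch on hypothetical box rows (real values would come from the Target = 1 rows of the labels file):

```python
import pandas as pd

# Hypothetical bounding-box rows in the same x/y/width/height convention.
boxes = pd.DataFrame({'x': [264.0, 562.0], 'y': [152.0, 152.0],
                      'width': [213.0, 256.0], 'height': [379.0, 453.0]})
# Center of each box = top-left corner plus half the extent;
# plotting (cx, cy) for all positive rows shows how centers are distributed.
boxes['cx'] = boxes['x'] + boxes['width'] / 2
boxes['cy'] = boxes['y'] + boxes['height'] / 2
print(boxes[['cx', 'cy']].values.tolist())  # [[370.5, 341.5], [690.0, 378.5]]
```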
## Taking 500 samples per class from the dataset
sample_trainigdata = training_data.groupby('class', group_keys=False).apply(lambda x: x.sample(500))
## Checking the class distribution of the sampled training data
sample_trainigdata["class"].value_counts()
class
Lung Opacity                    500
No Lung Opacity / Not Normal    500
Normal                          500
Name: count, dtype: int64
## Pre Processing the image
from tensorflow.keras.applications.mobilenet import preprocess_input
images = []
ADJUSTED_IMAGE_SIZE = 256
imageList = []
classLabels = []
labels = []
originalImage = []
# Function to read an image and resize it to ADJUSTED_IMAGE_SIZE x ADJUSTED_IMAGE_SIZE
def readAndReshapeImage(image):
    img = np.array(image).astype(np.uint8)
    ## Resize the image
    res = cv2.resize(img, (ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE), interpolation=cv2.INTER_LINEAR)
    return res

## Reading and resizing the images
def populateImage(rowData):
    for index, row in rowData.iterrows():
        patientId = row.patientId
        classlabel = row["class"]
        dcm_file = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/' + '{}.dcm'.format(patientId)
        dcm_data = pydicom.dcmread(dcm_file)  # read_file is deprecated
        img = dcm_data.pixel_array
        ## Convert the single-channel (grayscale) DICOM pixel array to 3 channels
        if len(img.shape) != 3 or img.shape[2] != 3:
            img = np.stack((img,) * 3, -1)
        imageList.append(readAndReshapeImage(img))
        originalImage.append(img)
        classLabels.append(classlabel)
    tmpImages = np.array(imageList)
    tmpLabels = np.array(classLabels)
    originalImages = np.array(originalImage)
    return tmpImages, tmpLabels, originalImages
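The 3-channel conversion inside populateImage can be illustrated on a dummy grayscale array: np.stack with a new last axis simply replicates the single plane three times, giving the RGB-shaped input that pre-trained backbones expect.

```python
import numpy as np

# Dummy single-channel image standing in for a DICOM pixel_array.
gray = np.zeros((1024, 1024), dtype=np.uint8)
# Replicate the grayscale plane into 3 identical channels along a new last axis.
rgb = np.stack((gray,) * 3, -1)
print(rgb.shape)  # (1024, 1024, 3)
```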
# Reading the images into numpy array
images, labels, originalImages = populateImage(sample_trainigdata)
# Checking image shape
images.shape , labels.shape
((1500, 256, 256, 3), (1500,))
Each image is 256 × 256 with 3 channels.
import random
sample_indices = random.sample(range(images.shape[0]), 5) # 5 random indices
for idx in sample_indices:
    plt.imshow(images[idx])
    plt.title(labels[idx])
    plt.show()
# Importing libraries (deduplicated, all from tensorflow.keras)
import tensorflow
from tensorflow import keras
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import (Dense, Activation, Flatten, Dropout, BatchNormalization,
                                     LeakyReLU, Conv2D, MaxPooling2D,
                                     GlobalAveragePooling2D, GlobalMaxPooling2D)
from tensorflow.keras import losses, optimizers, regularizers
from tensorflow.keras.optimizers import RMSprop, Adam
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.metrics import r2_score
# encoding the labels
from sklearn.preprocessing import LabelBinarizer
enc = LabelBinarizer()
y2 = enc.fit_transform(labels)
print(enc.classes_)
# Count of each label after encoding
label_counts = np.sum(y2, axis=0)
# Print the counts
for class_name, count in zip(enc.classes_, label_counts):
    print(f"{class_name}: {count}")
['Lung Opacity' 'No Lung Opacity / Not Normal' 'Normal']
Lung Opacity: 500
No Lung Opacity / Not Normal: 500
Normal: 500
# splitting into train ,test and validation data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(images, y2, test_size=0.3, random_state=42)
X_test, X_val, y_test, y_val = train_test_split(X_test,y_test, test_size = 0.5, random_state=42)
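The two-stage split above yields roughly a 70/15/15 train/validation/test partition. The arithmetic, assuming the 1,500-sample balanced set:

```python
n = 1500                   # 500 samples per class x 3 classes
n_holdout = int(n * 0.3)   # first split: 30% held out from training
n_val = n_holdout // 2     # second split: half of the holdout for validation
n_test = n_holdout - n_val # remaining half for testing
n_train = n - n_holdout
print(n_train, n_val, n_test)  # 1050 225 225
```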
# encoding the labels
from sklearn.preprocessing import LabelBinarizer
import numpy as np
enc = LabelBinarizer()
y2 = enc.fit_transform(labels)
# Print the classes
print("Classes:", enc.classes_)
# Count of each label after encoding
label_counts_y2 = np.sum(y2, axis=0)
print("\nCounts in y2:")
for class_name, count in zip(enc.classes_, label_counts_y2):
    print(f"{class_name}: {count}")
# Count of each label in y_val
label_counts_y_val = np.sum(y_val, axis=0)
print("\nCounts in y_val:")
for class_name, count in zip(enc.classes_, label_counts_y_val):
    print(f"{class_name}: {count}")
# Count of each label in y_test
label_counts_y_test = np.sum(y_test, axis=0)
print("\nCounts in y_test:")
for class_name, count in zip(enc.classes_, label_counts_y_test):
    print(f"{class_name}: {count}")
Classes: ['Lung Opacity' 'No Lung Opacity / Not Normal' 'Normal']

Counts in y2:
Lung Opacity: 500
No Lung Opacity / Not Normal: 500
Normal: 500

Counts in y_val:
Lung Opacity: 71
No Lung Opacity / Not Normal: 74
Normal: 80

Counts in y_test:
Lung Opacity: 77
No Lung Opacity / Not Normal: 67
Normal: 81
# Function to create a dataframe for results
def createResultDf(name, accuracy, testscore):
    result = pd.DataFrame({'Method': [name], 'accuracy': [accuracy], 'Test Score': [testscore]})
    return result
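Since createResultDf returns a single-row frame, reassigning resultDF later in the notebook overwrites the previous model's row. A sketch (with hypothetical accuracy values) of accumulating results with pd.concat instead; the function is reproduced here only so the block is self-contained:

```python
import pandas as pd

def createResultDf(name, accuracy, testscore):
    return pd.DataFrame({'Method': [name], 'accuracy': [accuracy], 'Test Score': [testscore]})

# Accumulate one row per experiment instead of overwriting resultDF.
resultDF = createResultDf('CNN', 0.53, 0.47)
resultDF = pd.concat([resultDF, createResultDf('CNN fine-tuned', 0.43, 0.51)],
                     ignore_index=True)
print(resultDF['Method'].tolist())  # ['CNN', 'CNN fine-tuned']
```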
# CNN model without transfer learning: we start with 32 filters with a 5x5 kernel and 'same' padding,
# then 64 and 128 filters with dropout layers in between, and a softmax activation as the last layer
def cnn_model(height, width, num_channels, num_classes, loss='categorical_crossentropy', metrics=['accuracy']):
    batch_size = None
    model = Sequential()
    model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same', activation='relu',
                     batch_input_shape=(batch_size, height, width, num_channels)))
    model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))
    model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(Dropout(0.3))
    model.add(Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu'))
    model.add(Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(Dropout(0.4))
    model.add(GlobalMaxPooling2D())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    optimizer = RMSprop(learning_rate=0.001, rho=0.9, epsilon=1e-08)
    model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
    model.summary()
    return model
# Model Summary
cnn = cnn_model(ADJUSTED_IMAGE_SIZE,ADJUSTED_IMAGE_SIZE,3,3)
Model: "sequential"
_________________________________________________________________
 Layer (type)                              Output Shape          Param #
=========================================================================
 conv2d (Conv2D)                           (None, 256, 256, 32)  2432
 conv2d_1 (Conv2D)                         (None, 256, 256, 32)  25632
 max_pooling2d (MaxPooling2D)              (None, 128, 128, 32)  0
 dropout (Dropout)                         (None, 128, 128, 32)  0
 conv2d_2 (Conv2D)                         (None, 128, 128, 64)  18496
 conv2d_3 (Conv2D)                         (None, 128, 128, 64)  36928
 max_pooling2d_1 (MaxPooling2D)            (None, 64, 64, 64)    0
 dropout_1 (Dropout)                       (None, 64, 64, 64)    0
 conv2d_4 (Conv2D)                         (None, 64, 64, 128)   73856
 conv2d_5 (Conv2D)                         (None, 64, 64, 128)   147584
 max_pooling2d_2 (MaxPooling2D)            (None, 32, 32, 128)   0
 dropout_2 (Dropout)                       (None, 32, 32, 128)   0
 global_max_pooling2d (GlobalMaxPooling2D) (None, 128)           0
 dense (Dense)                             (None, 256)           33024
 dropout_3 (Dropout)                       (None, 256)           0
 dense_1 (Dense)                           (None, 3)             771
=========================================================================
Total params: 338723 (1.29 MB)
Trainable params: 338723 (1.29 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
# Training for 30 epochs with a batch size of 30
history = cnn.fit(X_train,
y_train,
epochs = 30,
validation_data = (X_val,y_val),
batch_size = 30)
Epoch  1/30 - 143s 4s/step - loss: 6.2805 - accuracy: 0.3267 - val_loss: 1.0944 - val_accuracy: 0.3333
Epoch  2/30 - 142s 4s/step - loss: 1.1342 - accuracy: 0.3524 - val_loss: 1.0919 - val_accuracy: 0.3467
Epoch  3/30 - 144s 4s/step - loss: 1.1259 - accuracy: 0.3590 - val_loss: 1.0900 - val_accuracy: 0.3911
Epoch  4/30 - 149s 4s/step - loss: 1.1151 - accuracy: 0.3590 - val_loss: 1.0892 - val_accuracy: 0.4000
Epoch  5/30 - 145s 4s/step - loss: 1.1019 - accuracy: 0.3990 - val_loss: 1.0848 - val_accuracy: 0.3333
Epoch  6/30 - 145s 4s/step - loss: 1.0917 - accuracy: 0.3505 - val_loss: 1.1103 - val_accuracy: 0.3467
Epoch  7/30 - 149s 4s/step - loss: 1.1168 - accuracy: 0.3876 - val_loss: 1.0867 - val_accuracy: 0.4000
Epoch  8/30 - 145s 4s/step - loss: 1.0804 - accuracy: 0.3886 - val_loss: 1.0664 - val_accuracy: 0.4489
Epoch  9/30 - 154s 4s/step - loss: 1.0746 - accuracy: 0.4219 - val_loss: 1.0651 - val_accuracy: 0.3778
Epoch 10/30 - 155s 4s/step - loss: 1.0614 - accuracy: 0.4229 - val_loss: 1.0727 - val_accuracy: 0.4133
Epoch 11/30 - 150s 4s/step - loss: 1.0652 - accuracy: 0.4190 - val_loss: 1.0463 - val_accuracy: 0.4311
Epoch 12/30 - 171s 5s/step - loss: 1.0750 - accuracy: 0.4095 - val_loss: 1.0475 - val_accuracy: 0.4933
Epoch 13/30 - 168s 5s/step - loss: 1.0523 - accuracy: 0.4495 - val_loss: 1.0489 - val_accuracy: 0.4889
Epoch 14/30 - 151s 4s/step - loss: 1.0526 - accuracy: 0.4610 - val_loss: 1.0497 - val_accuracy: 0.4667
Epoch 15/30 - 149s 4s/step - loss: 1.0415 - accuracy: 0.4695 - val_loss: 1.0554 - val_accuracy: 0.4311
Epoch 16/30 - 151s 4s/step - loss: 1.0366 - accuracy: 0.4686 - val_loss: 1.0639 - val_accuracy: 0.4311
Epoch 17/30 - 166s 5s/step - loss: 1.0273 - accuracy: 0.4638 - val_loss: 1.0404 - val_accuracy: 0.4622
Epoch 18/30 - 163s 5s/step - loss: 1.0179 - accuracy: 0.4743 - val_loss: 1.0533 - val_accuracy: 0.4489
Epoch 19/30 - 158s 5s/step - loss: 1.0301 - accuracy: 0.4495 - val_loss: 1.0327 - val_accuracy: 0.4489
Epoch 20/30 - 165s 5s/step - loss: 1.0166 - accuracy: 0.5086 - val_loss: 1.0326 - val_accuracy: 0.4844
Epoch 21/30 - 151s 4s/step - loss: 1.0092 - accuracy: 0.5038 - val_loss: 1.0494 - val_accuracy: 0.4444
Epoch 22/30 - 149s 4s/step - loss: 0.9923 - accuracy: 0.5019 - val_loss: 1.0649 - val_accuracy: 0.4222
Epoch 23/30 - 149s 4s/step - loss: 1.0052 - accuracy: 0.4857 - val_loss: 1.0655 - val_accuracy: 0.3600
Epoch 24/30 - 149s 4s/step - loss: 0.9910 - accuracy: 0.4810 - val_loss: 1.0341 - val_accuracy: 0.4667
Epoch 25/30 - 147s 4s/step - loss: 0.9781 - accuracy: 0.5257 - val_loss: 1.1103 - val_accuracy: 0.4356
Epoch 26/30 - 146s 4s/step - loss: 0.9740 - accuracy: 0.5371 - val_loss: 1.0846 - val_accuracy: 0.3956
Epoch 27/30 - 150s 4s/step - loss: 0.9760 - accuracy: 0.5095 - val_loss: 1.0617 - val_accuracy: 0.4578
Epoch 28/30 - 145s 4s/step - loss: 0.9611 - accuracy: 0.5048 - val_loss: 1.0566 - val_accuracy: 0.4356
Epoch 29/30 - 149s 4s/step - loss: 0.9505 - accuracy: 0.5324 - val_loss: 1.0987 - val_accuracy: 0.4667
Epoch 30/30 - 150s 4s/step - loss: 0.9448 - accuracy: 0.5314 - val_loss: 1.1197 - val_accuracy: 0.4400
# Evaluating on the held-out test set
fcl_loss, fcl_accuracy = cnn.evaluate(X_test, y_test, verbose=1)
print('Test loss:', fcl_loss)
print('Test accuracy:', fcl_accuracy)
8/8 [==============================] - 6s 722ms/step - loss: 1.0676 - accuracy: 0.4667
Test loss: 1.0675538778305054
Test accuracy: 0.46666666865348816
# Plotting the accuracy and loss curves
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(len(acc))
plt.figure(figsize=(15, 15))
plt.subplot(2, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(2, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
resultDF = createResultDf("CNN",acc[-1],fcl_accuracy)
import numpy as np
import matplotlib.pyplot as plt
import itertools
from sklearn.metrics import confusion_matrix, classification_report
# Confusion matrix plotting function
def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    if normalize:
        # Normalize before plotting so the image and the cell text agree
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes)
    plt.yticks(tick_marks, classes)
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
# Predict the values from the validation dataset
Y_pred = cnn.predict(X_test)
# Convert prediction probabilities to class indices
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert one-hot test labels back to class indices
Y_true = np.argmax(y_test, axis=1)
# Compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# Class names for the confusion matrix
class_names = ['Lung Opacity', 'No Lung Opacity/Not Normal', 'Normal']
# Plot the confusion matrix
plt.subplots(figsize=(22, 7))
plot_confusion_matrix(confusion_mtx, classes=class_names, normalize=False)
plt.show()
# Print the classification report
print(classification_report(Y_true, Y_pred_classes, target_names=class_names))
8/8 [==============================] - 6s 725ms/step
precision recall f1-score support
Lung Opacity 0.53 0.55 0.54 77
No Lung Opacity/Not Normal 0.24 0.16 0.20 67
Normal 0.51 0.64 0.57 81
accuracy 0.47 225
macro avg 0.43 0.45 0.44 225
weighted avg 0.44 0.47 0.45 225
Inferences
The training log suggests the model has started learning, though slowly, and is beginning to overfit slightly.
By the 30th epoch, the model achieves a training accuracy of 53.14% and a validation accuracy of 44.00%. This suggests there is room for improvement, either by changing the model architecture, adjusting hyperparameters, using data augmentation, or gathering more data.
Overall accuracy hovers around ~40-45% on the validation dataset, and the loss values are not showing significant improvement.
Several potential reasons could explain this:
Random Initialization: Sometimes the model might need a few runs with different random weight initializations to start learning effectively.
Learning Rate: The learning rate for the RMSprop optimizer is set to 0.001. This may be too high or too low, preventing the model from converging. We can try a learning-rate scheduler.
Model Complexity: It could be that the model is too complex or too simple for the task. Given the architecture, it doesn't seem to be the case, but it's something to keep in mind.
Dataset Balance: If the dataset is imbalanced (i.e., one class has many more samples than the other), the model might have difficulty learning. Make sure your dataset has a relatively balanced number of samples for each class or consider using techniques like oversampling, undersampling, or using class weights.
Data Augmentation: Implementing data augmentation can introduce variability into the dataset, potentially aiding the model in generalizing better.
Batch Size: The batch size can significantly impact the learning dynamics. You might want to experiment with smaller or larger batch sizes.
Dropout Rate: High dropout rates can sometimes hinder learning, especially in the initial phases. Consider lowering the dropout rates temporarily to see if the model starts learning.
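The class-weights suggestion above can be sketched with scikit-learn's compute_class_weight. On our balanced 500/500/500 sample the weights come out as 1.0 for every class, but on the full, imbalanced dataset they would differ and can be passed to model.fit via its class_weight argument:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical integer labels mirroring the balanced 500-per-class sample.
y_int = np.array([0] * 500 + [1] * 500 + [2] * 500)
# 'balanced' weighting: n_samples / (n_classes * bincount(y))
weights = compute_class_weight('balanced', classes=np.array([0, 1, 2]), y=y_int)
class_weight = dict(enumerate(weights))
print(class_weight)  # {0: 1.0, 1: 1.0, 2: 1.0} for a balanced sample
```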
Let's first try to fine-tune our basic CNN model.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, GlobalMaxPooling2D, Dense
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint
from tensorflow.keras.regularizers import l2
def cnn_model_multiclass(height, width, num_channels, loss='categorical_crossentropy', metrics=['accuracy']):
    model = Sequential()
    reg_strength = 0.0001
    model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                     activation='relu', kernel_regularizer=l2(reg_strength),
                     input_shape=(height, width, num_channels)))
    model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                     activation='relu', kernel_regularizer=l2(reg_strength)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same',
                     activation='relu', kernel_regularizer=l2(reg_strength)))
    model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same',
                     activation='relu', kernel_regularizer=l2(reg_strength)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.3))
    model.add(Conv2D(filters=128, kernel_size=(3, 3), padding='same',
                     activation='relu', kernel_regularizer=l2(reg_strength)))
    model.add(Conv2D(filters=128, kernel_size=(3, 3), padding='same',
                     activation='relu', kernel_regularizer=l2(reg_strength)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.4))
    model.add(GlobalMaxPooling2D())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(3, activation='softmax'))  # 3 output neurons, one per class
    optimizer = RMSprop(learning_rate=0.001, rho=0.9, epsilon=1e-08)
    model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
    return model
# Callbacks
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=1e-6)
early_stop = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1, restore_best_weights=True)
checkpoint = ModelCheckpoint("best_weights.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [reduce_lr, early_stop, checkpoint]
ADJUSTED_IMAGE_SIZE = 256
cnn_multiclass = cnn_model_multiclass(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)
history = cnn_multiclass.fit(X_train,
y_train,
epochs=10,
validation_data=(X_val, y_val),
batch_size=30,
callbacks=callbacks_list)
Epoch  1/10 - 146s 4s/step - loss: 7.7941 - accuracy: 0.3390 - val_loss: 1.1240 - val_accuracy: 0.3822 - lr: 0.0010 (val_accuracy improved from -inf to 0.38222, saving best_weights.h5)
Epoch  2/10 - 143s 4s/step - loss: 1.1663 - accuracy: 0.3600 - val_loss: 1.1265 - val_accuracy: 0.3644 - lr: 0.0010
Epoch  3/10 - 152s 4s/step - loss: 1.1532 - accuracy: 0.3848 - val_loss: 1.1281 - val_accuracy: 0.3422 - lr: 0.0010
Epoch  4/10 - 153s 4s/step - loss: 1.1444 - accuracy: 0.3762 - val_loss: 1.1239 - val_accuracy: 0.3289 - lr: 0.0010
Epoch  5/10 - 172s 5s/step - loss: 1.1329 - accuracy: 0.3848 - val_loss: 1.1105 - val_accuracy: 0.4044 - lr: 0.0010 (val_accuracy improved to 0.40444, saving best_weights.h5)
Epoch  6/10 - 176s 5s/step - loss: 1.1274 - accuracy: 0.3590 - val_loss: 1.1117 - val_accuracy: 0.4533 - lr: 0.0010 (val_accuracy improved to 0.45333, saving best_weights.h5)
Epoch  7/10 - 165s 5s/step - loss: 1.1665 - accuracy: 0.4133 - val_loss: 1.1000 - val_accuracy: 0.3778 - lr: 0.0010
Epoch  8/10 - 163s 5s/step - loss: 1.1036 - accuracy: 0.4267 - val_loss: 1.0579 - val_accuracy: 0.4667 - lr: 0.0010 (val_accuracy improved to 0.46667, saving best_weights.h5)
Epoch  9/10 - 155s 4s/step - loss: 1.1079 - accuracy: 0.4200 - val_loss: 1.0946 - val_accuracy: 0.3378 - lr: 0.0010
Epoch 10/10 - 159s 5s/step - loss: 1.0961 - accuracy: 0.4314 - val_loss: 1.0853 - val_accuracy: 0.4711 - lr: 0.0010 (val_accuracy improved to 0.47111, saving best_weights.h5)
# Evaluating on the held-out test set
fcl_loss, fcl_accuracy = cnn_multiclass.evaluate(X_test, y_test, verbose=1)
print('Test loss:', fcl_loss)
print('Test accuracy:', fcl_accuracy)
8/8 [==============================] - 6s 718ms/step - loss: 1.0912 - accuracy: 0.5067
Test loss: 1.0912171602249146
Test accuracy: 0.5066666603088379
resultDF = createResultDf("CNN fine-tuned", history.history['accuracy'][-1], fcl_accuracy)
from sklearn.metrics import confusion_matrix
import itertools
plt.subplots(figsize=(22, 7))  # set the size of the plot
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    if normalize:
        # Normalize before plotting so the image and the cell text agree
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes)
    plt.yticks(tick_marks, classes)
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
# Predict the values from the validation dataset
Y_pred = cnn_multiclass.predict(X_test)
# Convert prediction probabilities to class indices
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert one-hot test labels back to class indices
Y_true = np.argmax(y_test, axis=1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# Plot the confusion matrix with readable class names
plot_confusion_matrix(confusion_mtx, classes=class_names)
plt.show()
# Print the classification report
print(classification_report(Y_true, Y_pred_classes, target_names=class_names))
8/8 [==============================] - 6s 773ms/step
precision recall f1-score support
Lung Opacity 0.48 0.70 0.57 77
No Lung Opacity/Not Normal 0.37 0.16 0.23 67
Normal 0.60 0.60 0.60 81
accuracy 0.51 225
macro avg 0.48 0.49 0.47 225
weighted avg 0.49 0.51 0.48 225
With fine-tuning of our basic CNN (L2 regularization plus learning-rate, early-stopping, and checkpoint callbacks), test accuracy increased from 46.7% to 50.7%, a noticeable improvement. Now we will try a few pre-trained models.
import os
import numpy as np
import pandas as pd
import pydicom
import cv2
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from keras.optimizers import Adam
from sklearn.preprocessing import LabelBinarizer
from tensorflow.keras.regularizers import l2
import matplotlib.pyplot as plt
# Sample a subset of the data
sample_trainingdata = training_data.groupby('class', group_keys=False).apply(lambda x: x.sample(1200))
# Preprocess DICOM images
ADJUSTED_IMAGE_SIZE = 256
def read_and_reshape_image(image):
    img = np.array(image).astype(np.uint8)
    res = cv2.resize(img, (ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE), interpolation=cv2.INTER_LINEAR)
    return res

def populate_image(data):
    images = []
    labels = []
    for index, row in data.iterrows():
        patientId = row.patientId
        classlabel = row["class"]
        dcm_file = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/' + '{}.dcm'.format(patientId)
        # pydicom.read_file is deprecated; dcmread is the current API
        dcm_data = pydicom.dcmread(dcm_file)
        img = dcm_data.pixel_array
        # Replicate the grayscale plane so the image has 3 channels
        if len(img.shape) != 3 or img.shape[2] != 3:
            img = np.stack((img,) * 3, -1)
        images.append(read_and_reshape_image(img))
        labels.append(classlabel)
    images = np.array(images)
    labels = np.array(labels)
    return images, labels
images, labels = populate_image(sample_trainingdata)
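The `np.stack((img,) * 3, -1)` call in `populate_image` converts the single-channel DICOM pixel array into a 3-channel image, matching the RGB input shape the pre-trained backbones expect. A standalone check of that conversion on a synthetic array:

```python
import numpy as np

# Synthetic grayscale image standing in for a DICOM pixel_array
gray = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Replicate the grayscale plane into 3 identical channels
rgb = np.stack((gray,) * 3, -1)

print(rgb.shape)  # (4, 4, 3)
```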
# Encode the labels
enc = LabelBinarizer()
encoded_labels = enc.fit_transform(labels)
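`LabelBinarizer` maps the three class strings to one-hot rows, with one column per class ordered alphabetically by class name. A quick sketch on dummy labels (renamed `demo_*` to avoid clobbering the variables above):

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

demo_labels = np.array(['Normal', 'Lung Opacity', 'Normal',
                        'No Lung Opacity / Not Normal'])

demo_enc = LabelBinarizer()
demo_onehot = demo_enc.fit_transform(demo_labels)

print(demo_enc.classes_)   # columns are ordered alphabetically by class name
print(demo_onehot.shape)   # (4, 3)
```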
# Split the data
X_train, X_validate, y_train, y_validate = train_test_split(images, encoded_labels, test_size=0.1, stratify=labels, random_state=42)
# Data Augmentation
BATCH_SIZE = 64
train_datagen = ImageDataGenerator(
    rotation_range=20,
    rescale=1./255,
    shear_range=0.15,
    zoom_range=0.3,
    horizontal_flip=True,
    width_shift_range=0.15,
    height_shift_range=0.15
)
train_generator = train_datagen.flow(X_train, y_train, batch_size=BATCH_SIZE)
validate_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validate_datagen.flow(X_validate, y_validate, batch_size=BATCH_SIZE)
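The training generator applies random geometric transforms plus the 1/255 rescale; the validation generator applies only the rescale. The rescale and a horizontal flip can be mimicked for one image with plain NumPy (synthetic data, just to show the value ranges involved):

```python
import numpy as np

rng = np.random.default_rng(0)
demo_img = rng.integers(0, 256, size=(8, 8, 3)).astype(np.float32)

scaled = demo_img / 255.0          # what rescale=1./255 does
flipped = scaled[:, ::-1, :]       # a deterministic horizontal flip

print(scaled.min() >= 0.0 and scaled.max() <= 1.0)  # True
```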
# Load Pre-trained Models and build custom models
base_models = [
    keras.applications.VGG16(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.InceptionV3(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.ResNet50(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.DenseNet121(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.MobileNetV2(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.Xception(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.EfficientNetB0(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3))
]
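One caveat worth noting: each `keras.applications` family ships its own `preprocess_input` (for example, VGG16 and ResNet50 use 'caffe'-style preprocessing: an RGB-to-BGR flip plus per-channel ImageNet mean subtraction, while EfficientNet variants expect raw [0, 255] inputs), so a uniform `rescale=1./255` will not match what every backbone was pre-trained with. A NumPy sketch of the caffe-style step only, as an illustration rather than a drop-in replacement for `preprocess_input` (mean values assumed from the Keras documentation):

```python
import numpy as np

# ImageNet channel means in BGR order, as used by the 'caffe'-style
# preprocess_input of VGG16/ResNet50
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(img_rgb):
    """RGB -> BGR channel flip, then per-channel mean subtraction."""
    bgr = img_rgb[..., ::-1].astype(np.float32)
    return bgr - IMAGENET_BGR_MEAN

demo_out = caffe_style_preprocess(np.zeros((2, 2, 3), dtype=np.uint8))
print(demo_out[0, 0])  # each channel is the negated ImageNet mean
```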
earlystop = EarlyStopping(patience=10)
learning_rate_reduction = ReduceLROnPlateau(monitor='val_loss', patience=2, verbose=1, factor=0.5, min_lr=0.00001)
models = []
history_data = []
for idx, base_model in enumerate(base_models):
    model = keras.Sequential([
        base_model,
        GlobalAveragePooling2D(),
        Dense(1024, activation='relu', kernel_regularizer=l2(0.01)),  # L2 regularization on the head
        Dropout(0.3),
        Dense(encoded_labels.shape[1], activation='softmax')
    ])
    model.compile(optimizer=Adam(learning_rate=0.0005), loss='categorical_crossentropy', metrics=['accuracy'])
    checkpoint = ModelCheckpoint(f"best_model_{idx}.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
    callbacks = [earlystop, learning_rate_reduction, checkpoint]
    history = model.fit(
        train_generator,
        epochs=10,
        validation_data=validation_generator,
        validation_steps=len(X_validate) // BATCH_SIZE,
        callbacks=callbacks
    )
    models.append(model)
    history_data.append(history)
# Plotting graphs post training
for idx, history in enumerate(history_data):
    plt.figure(figsize=(12, 4))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title(f'Model {idx + 1} - Loss')
    plt.legend()
    plt.subplot(1, 2, 2)
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title(f'Model {idx + 1} - Accuracy')
    plt.legend()
    plt.tight_layout()
    plt.show()
# Improved version: 224x224 inputs, fully trainable base layers, deeper classification head, LR scheduler, 15 epochs
import os
import numpy as np
import pandas as pd
import pydicom
import cv2
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from keras.optimizers import Adam
from sklearn.preprocessing import LabelBinarizer
from tensorflow.keras.regularizers import l2
import matplotlib.pyplot as plt
# Sample a subset of the data
sample_trainingdata = training_data.groupby('class', group_keys=False).apply(lambda x: x.sample(1200))
# Preprocess DICOM images
ADJUSTED_IMAGE_SIZE = 224
def read_and_reshape_image(image):
    img = np.array(image).astype(np.uint8)
    res = cv2.resize(img, (ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE), interpolation=cv2.INTER_LINEAR)
    return res

def populate_image(data):
    images = []
    labels = []
    for index, row in data.iterrows():
        patientId = row.patientId
        classlabel = row["class"]
        dcm_file = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/' + '{}.dcm'.format(patientId)
        # pydicom.read_file is deprecated; dcmread is the current API
        dcm_data = pydicom.dcmread(dcm_file)
        img = dcm_data.pixel_array
        # Replicate the grayscale plane so the image has 3 channels
        if len(img.shape) != 3 or img.shape[2] != 3:
            img = np.stack((img,) * 3, -1)
        images.append(read_and_reshape_image(img))
        labels.append(classlabel)
    images = np.array(images)
    labels = np.array(labels)
    return images, labels
images, labels = populate_image(sample_trainingdata)
# Encode the labels
enc = LabelBinarizer()
encoded_labels = enc.fit_transform(labels)
# Split the data
X_train, X_validate, y_train, y_validate = train_test_split(images, encoded_labels, test_size=0.1, stratify=labels, random_state=42)
# Data Augmentation
BATCH_SIZE = 64
train_datagen = ImageDataGenerator(
    rotation_range=20,
    rescale=1./255,
    shear_range=0.15,
    zoom_range=0.3,
    horizontal_flip=True,
    width_shift_range=0.15,
    height_shift_range=0.15
)
train_generator = train_datagen.flow(X_train, y_train, batch_size=BATCH_SIZE)
validate_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validate_datagen.flow(X_validate, y_validate, batch_size=BATCH_SIZE)
# Load Pre-trained Models and build custom models
base_models = [
    keras.applications.VGG16(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.InceptionV3(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.ResNet50(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.DenseNet121(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.MobileNetV2(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.Xception(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)),
    keras.applications.EfficientNetB0(include_top=False, input_shape=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3))
]

# Ensure base model layers are trainable (full fine-tuning)
for base_model in base_models:
    for layer in base_model.layers:
        layer.trainable = True
earlystop = EarlyStopping(patience=10)
learning_rate_reduction = ReduceLROnPlateau(monitor='val_loss', patience=2, verbose=1, factor=0.5, min_lr=0.00001)
# Learning rate scheduler
def scheduler(epoch, lr):
    if epoch < 5:
        return lr
    else:
        return lr * tf.math.exp(-0.1)
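Keras calls this scheduler at the start of every epoch with the current learning rate, so the rate stays flat for 5 epochs and then decays by a factor of exp(-0.1) per epoch (ignoring the interaction with ReduceLROnPlateau, which can lower it further). A pure-Python simulation of the schedule, starting from the 5e-4 passed to Adam:

```python
import math

def lr_at_each_epoch(initial_lr, epochs):
    """Mimic the scheduler above: flat for 5 epochs, then exp(-0.1) decay per epoch."""
    lr = initial_lr
    rates = []
    for epoch in range(epochs):
        if epoch >= 5:
            lr = lr * math.exp(-0.1)
        rates.append(lr)
    return rates

rates = lr_at_each_epoch(5e-4, 15)
sim_lr = rates[-1]
print(f'{sim_lr:.4e}')  # 1.8394e-04 after 15 epochs (10 decay steps)
```

This matches the `lr: 1.8394e-04` reported at epoch 15 in the training output when ReduceLROnPlateau never fires.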
lr_schedule_callback = tf.keras.callbacks.LearningRateScheduler(scheduler)
models = []
history_data = []
for idx, base_model in enumerate(base_models):
    model = keras.Sequential([
        base_model,
        GlobalAveragePooling2D(),
        Dense(1024, activation='relu', kernel_regularizer=l2(0.01)),
        Dropout(0.5),
        Dense(512, activation='relu', kernel_regularizer=l2(0.01)),
        Dropout(0.5),
        Dense(encoded_labels.shape[1], activation='softmax')
    ])
    model.compile(optimizer=Adam(learning_rate=0.0005), loss='categorical_crossentropy', metrics=['accuracy'])
    checkpoint = ModelCheckpoint(f"best_model_{idx}.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
    callbacks = [earlystop, learning_rate_reduction, checkpoint, lr_schedule_callback]
    history = model.fit(
        train_generator,
        epochs=15,
        validation_data=validation_generator,
        validation_steps=len(X_validate) // BATCH_SIZE,
        callbacks=callbacks
    )
    models.append(model)
    history_data.append(history)
# Plotting graphs post training
for idx, history in enumerate(history_data):
    plt.figure(figsize=(12, 4))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title(f'Model {idx + 1} - Loss')
    plt.legend()
    plt.subplot(1, 2, 2)
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title(f'Model {idx + 1} - Accuracy')
    plt.legend()
    plt.tight_layout()
    plt.show()
Training output (condensed; the full per-epoch logs are omitted). Best validation accuracy checkpointed for each model over the 15 epochs:
- Model 0 (VGG16): stuck near chance level throughout; best val_accuracy 0.3562 (epoch 7)
- Model 1 (InceptionV3): best val_accuracy 0.7438 (epoch 11)
- Model 2 (ResNet50): train accuracy reached ~0.79 but val_accuracy stayed near chance; best 0.4000 (epoch 15)
- Model 3 (DenseNet121): best val_accuracy 0.7156 (epoch 11)
- Model 4 (MobileNetV2): best val_accuracy 0.6438 (epoch 8)
- Model 5 (Xception): best val_accuracy 0.7563 (epoch 8); the log is truncated during epoch 10
- Model 6 (EfficientNetB0): log truncated before its training output
Epoch 10: val_accuracy did not improve from 0.75625 51/51 [==============================] - 789s 15s/step - loss: 0.4978 - accuracy: 0.8160 - val_loss: 0.8618 - val_accuracy: 0.7531 - lr: 1.5163e-04 Epoch 11/15 51/51 [==============================] - ETA: 0s - loss: 0.4133 - accuracy: 0.8500 Epoch 11: val_accuracy did not improve from 0.75625 51/51 [==============================] - 764s 15s/step - loss: 0.4133 - accuracy: 0.8500 - val_loss: 0.8458 - val_accuracy: 0.6938 - lr: 1.3720e-04 Epoch 12/15 51/51 [==============================] - ETA: 0s - loss: 0.3889 - accuracy: 0.8627 Epoch 12: ReduceLROnPlateau reducing learning rate to 6.207317346706986e-05. Epoch 12: val_accuracy did not improve from 0.75625 51/51 [==============================] - 759s 15s/step - loss: 0.3889 - accuracy: 0.8627 - val_loss: 0.9843 - val_accuracy: 0.7094 - lr: 6.2073e-05 Epoch 13/15 51/51 [==============================] - ETA: 0s - loss: 0.3239 - accuracy: 0.8923 Epoch 13: val_accuracy did not improve from 0.75625 51/51 [==============================] - 756s 15s/step - loss: 0.3239 - accuracy: 0.8923 - val_loss: 0.9007 - val_accuracy: 0.7375 - lr: 5.6166e-05 Epoch 14/15 51/51 [==============================] - ETA: 0s - loss: 0.2983 - accuracy: 0.9003 Epoch 14: ReduceLROnPlateau reducing learning rate to 2.54106071224669e-05. 
Epoch 14: val_accuracy did not improve from 0.75625 51/51 [==============================] - 766s 15s/step - loss: 0.2983 - accuracy: 0.9003 - val_loss: 0.9214 - val_accuracy: 0.7281 - lr: 2.5411e-05 Epoch 15/15 51/51 [==============================] - ETA: 0s - loss: 0.2752 - accuracy: 0.9154 Epoch 15: val_accuracy did not improve from 0.75625 51/51 [==============================] - 783s 15s/step - loss: 0.2752 - accuracy: 0.9154 - val_loss: 0.8540 - val_accuracy: 0.7250 - lr: 2.2992e-05 Epoch 1/15 51/51 [==============================] - ETA: 0s - loss: 13.6382 - accuracy: 0.5818 Epoch 1: val_accuracy improved from -inf to 0.33750, saving model to best_model_6.h5 51/51 [==============================] - 357s 7s/step - loss: 13.6382 - accuracy: 0.5818 - val_loss: 9.2772 - val_accuracy: 0.3375 - lr: 5.0000e-04 Epoch 2/15 51/51 [==============================] - ETA: 0s - loss: 6.0596 - accuracy: 0.6796 Epoch 2: val_accuracy did not improve from 0.33750 51/51 [==============================] - 339s 7s/step - loss: 6.0596 - accuracy: 0.6796 - val_loss: 4.3454 - val_accuracy: 0.3375 - lr: 5.0000e-04 Epoch 3/15 51/51 [==============================] - ETA: 0s - loss: 2.8116 - accuracy: 0.6907 Epoch 3: val_accuracy improved from 0.33750 to 0.35313, saving model to best_model_6.h5 51/51 [==============================] - 344s 7s/step - loss: 2.8116 - accuracy: 0.6907 - val_loss: 2.4040 - val_accuracy: 0.3531 - lr: 5.0000e-04 Epoch 4/15 51/51 [==============================] - ETA: 0s - loss: 1.5242 - accuracy: 0.7065 Epoch 4: val_accuracy improved from 0.35313 to 0.38125, saving model to best_model_6.h5 51/51 [==============================] - 347s 7s/step - loss: 1.5242 - accuracy: 0.7065 - val_loss: 1.6692 - val_accuracy: 0.3812 - lr: 5.0000e-04 Epoch 5/15 51/51 [==============================] - ETA: 0s - loss: 1.0320 - accuracy: 0.7253 Epoch 5: val_accuracy did not improve from 0.38125 51/51 [==============================] - 342s 7s/step - loss: 1.0320 - accuracy: 
0.7253 - val_loss: 1.4024 - val_accuracy: 0.3562 - lr: 5.0000e-04 Epoch 6/15 51/51 [==============================] - ETA: 0s - loss: 0.8159 - accuracy: 0.7343 Epoch 6: val_accuracy did not improve from 0.38125 51/51 [==============================] - 342s 7s/step - loss: 0.8159 - accuracy: 0.7343 - val_loss: 1.2824 - val_accuracy: 0.3406 - lr: 4.5242e-04 Epoch 7/15 51/51 [==============================] - ETA: 0s - loss: 0.7323 - accuracy: 0.7417 Epoch 7: val_accuracy did not improve from 0.38125 51/51 [==============================] - 340s 7s/step - loss: 0.7323 - accuracy: 0.7417 - val_loss: 1.2355 - val_accuracy: 0.3688 - lr: 4.0937e-04 Epoch 8/15 51/51 [==============================] - ETA: 0s - loss: 0.6458 - accuracy: 0.7660 Epoch 8: val_accuracy did not improve from 0.38125 51/51 [==============================] - 336s 7s/step - loss: 0.6458 - accuracy: 0.7660 - val_loss: 1.2614 - val_accuracy: 0.3125 - lr: 3.7041e-04 Epoch 9/15 51/51 [==============================] - ETA: 0s - loss: 0.6105 - accuracy: 0.7778 Epoch 9: ReduceLROnPlateau reducing learning rate to 0.0001675800303928554. 
Epoch 9: val_accuracy did not improve from 0.38125 51/51 [==============================] - 342s 7s/step - loss: 0.6105 - accuracy: 0.7778 - val_loss: 12.2768 - val_accuracy: 0.3469 - lr: 1.6758e-04 Epoch 10/15 51/51 [==============================] - ETA: 0s - loss: 0.5661 - accuracy: 0.7935 Epoch 10: val_accuracy did not improve from 0.38125 51/51 [==============================] - 344s 7s/step - loss: 0.5661 - accuracy: 0.7935 - val_loss: 1.4679 - val_accuracy: 0.3594 - lr: 1.5163e-04 Epoch 11/15 51/51 [==============================] - ETA: 0s - loss: 0.5423 - accuracy: 0.7966 Epoch 11: val_accuracy improved from 0.38125 to 0.42188, saving model to best_model_6.h5 51/51 [==============================] - 342s 7s/step - loss: 0.5423 - accuracy: 0.7966 - val_loss: 1.1372 - val_accuracy: 0.4219 - lr: 1.3720e-04 Epoch 12/15 51/51 [==============================] - ETA: 0s - loss: 0.5143 - accuracy: 0.8056 Epoch 12: val_accuracy did not improve from 0.42188 51/51 [==============================] - 339s 7s/step - loss: 0.5143 - accuracy: 0.8056 - val_loss: 1.1619 - val_accuracy: 0.3187 - lr: 1.2415e-04 Epoch 13/15 51/51 [==============================] - ETA: 0s - loss: 0.4905 - accuracy: 0.8160 Epoch 13: ReduceLROnPlateau reducing learning rate to 5.6166129070334136e-05. 
Epoch 13: val_accuracy did not improve from 0.42188 51/51 [==============================] - 339s 7s/step - loss: 0.4905 - accuracy: 0.8160 - val_loss: 1.3566 - val_accuracy: 0.3313 - lr: 5.6166e-05 Epoch 14/15 51/51 [==============================] - ETA: 0s - loss: 0.4723 - accuracy: 0.8253 Epoch 14: val_accuracy did not improve from 0.42188 51/51 [==============================] - 335s 7s/step - loss: 0.4723 - accuracy: 0.8253 - val_loss: 3.2995 - val_accuracy: 0.3688 - lr: 5.0821e-05 Epoch 15/15 51/51 [==============================] - ETA: 0s - loss: 0.4661 - accuracy: 0.8262 Epoch 15: val_accuracy improved from 0.42188 to 0.42812, saving model to best_model_6.h5 51/51 [==============================] - 338s 7s/step - loss: 0.4661 - accuracy: 0.8262 - val_loss: 1.0974 - val_accuracy: 0.4281 - lr: 4.5985e-05
import tensorflow as tf
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# List of model files
model_files = [
"best_model_0.h5",
"best_model_1.h5",
"best_model_2.h5",
"best_model_3.h5",
"best_model_4.h5",
"best_model_5.h5",
"best_model_6.h5"
]
# Names of the models for display
model_names = [
"VGG16",
"InceptionV3",
"ResNet50",
"DenseNet121",
"MobileNetV2",
"Xception",
"EfficientNetB0"
]
# Image scaling as done during training/validation
validate_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validate_datagen.flow(X_validate, y_validate, batch_size=64, shuffle=False)
# Resize test data to match the model's expected input shape
X_test_resized = np.array([cv2.resize(img, (ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE)) for img in X_test])
# Loop through each model file: display its summary, then evaluate it
for file, name in zip(model_files, model_names):
    print(f"===== Model: {name} =====")
    model = load_model(file)
    model.summary()
    print("\n\n")
    # Compute validation accuracy using the rescaling generator
    val_loss, val_accuracy = model.evaluate(validation_generator, verbose=0)
    print(f"\nValidation Accuracy for {name}: {val_accuracy:.4f}\n\n")
    # Compute test accuracy using the resized test data
    # (note: X_test_resized is not rescaled by 1./255 here, unlike the
    # validation generator, so the test scores are not directly comparable)
    test_loss, test_accuracy = model.evaluate(X_test_resized, y_test, verbose=0)
    print(f"Test Accuracy for {name}: {test_accuracy:.4f}\n\n")
===== Model: VGG16 =====
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 7, 7, 512) 14714688
global_average_pooling2d_2 (None, 512) 0
(GlobalAveragePooling2D)
dense_12 (Dense) (None, 1024) 525312
dropout_16 (Dropout) (None, 1024) 0
dense_13 (Dense) (None, 512) 524800
dropout_17 (Dropout) (None, 512) 0
dense_14 (Dense) (None, 3) 1539
=================================================================
Total params: 15766339 (60.14 MB)
Trainable params: 15766339 (60.14 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Validation Accuracy for VGG16: 0.3333
Test Accuracy for VGG16: 0.2978
===== Model: InceptionV3 =====
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
inception_v3 (Functional) (None, 5, 5, 2048) 21802784
global_average_pooling2d_3 (None, 2048) 0
(GlobalAveragePooling2D)
dense_15 (Dense) (None, 1024) 2098176
dropout_18 (Dropout) (None, 1024) 0
dense_16 (Dense) (None, 512) 524800
dropout_19 (Dropout) (None, 512) 0
dense_17 (Dense) (None, 3) 1539
=================================================================
Total params: 24427299 (93.18 MB)
Trainable params: 24392867 (93.05 MB)
Non-trainable params: 34432 (134.50 KB)
_________________________________________________________________
Validation Accuracy for InceptionV3: 0.7250
Test Accuracy for InceptionV3: 0.3422
===== Model: ResNet50 =====
Model: "sequential_7"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
resnet50 (Functional) (None, 7, 7, 2048) 23587712
global_average_pooling2d_4 (None, 2048) 0
(GlobalAveragePooling2D)
dense_18 (Dense) (None, 1024) 2098176
dropout_20 (Dropout) (None, 1024) 0
dense_19 (Dense) (None, 512) 524800
dropout_21 (Dropout) (None, 512) 0
dense_20 (Dense) (None, 3) 1539
=================================================================
Total params: 26212227 (99.99 MB)
Trainable params: 26159107 (99.79 MB)
Non-trainable params: 53120 (207.50 KB)
_________________________________________________________________
Validation Accuracy for ResNet50: 0.4000
Test Accuracy for ResNet50: 0.3600
===== Model: DenseNet121 =====
Model: "sequential_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
densenet121 (Functional) (None, 7, 7, 1024) 7037504
global_average_pooling2d_5 (None, 1024) 0
(GlobalAveragePooling2D)
dense_21 (Dense) (None, 1024) 1049600
dropout_22 (Dropout) (None, 1024) 0
dense_22 (Dense) (None, 512) 524800
dropout_23 (Dropout) (None, 512) 0
dense_23 (Dense) (None, 3) 1539
=================================================================
Total params: 8613443 (32.86 MB)
Trainable params: 8529795 (32.54 MB)
Non-trainable params: 83648 (326.75 KB)
_________________________________________________________________
Validation Accuracy for DenseNet121: 0.7028
Test Accuracy for DenseNet121: 0.3600
===== Model: MobileNetV2 =====
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
mobilenetv2_1.00_224 (Func (None, 7, 7, 1280) 2257984
tional)
global_average_pooling2d_6 (None, 1280) 0
(GlobalAveragePooling2D)
dense_24 (Dense) (None, 1024) 1311744
dropout_24 (Dropout) (None, 1024) 0
dense_25 (Dense) (None, 512) 524800
dropout_25 (Dropout) (None, 512) 0
dense_26 (Dense) (None, 3) 1539
=================================================================
Total params: 4096067 (15.63 MB)
Trainable params: 4061955 (15.50 MB)
Non-trainable params: 34112 (133.25 KB)
_________________________________________________________________
Validation Accuracy for MobileNetV2: 0.6417
Test Accuracy for MobileNetV2: 0.3644
===== Model: Xception =====
Model: "sequential_10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
xception (Functional) (None, 7, 7, 2048) 20861480
global_average_pooling2d_7 (None, 2048) 0
(GlobalAveragePooling2D)
dense_27 (Dense) (None, 1024) 2098176
dropout_26 (Dropout) (None, 1024) 0
dense_28 (Dense) (None, 512) 524800
dropout_27 (Dropout) (None, 512) 0
dense_29 (Dense) (None, 3) 1539
=================================================================
Total params: 23485995 (89.59 MB)
Trainable params: 23431467 (89.38 MB)
Non-trainable params: 54528 (213.00 KB)
_________________________________________________________________
Validation Accuracy for Xception: 0.7472
Test Accuracy for Xception: 0.3422
===== Model: EfficientNetB0 =====
Model: "sequential_11"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
efficientnetb0 (Functional (None, 7, 7, 1280) 4049571
)
global_average_pooling2d_8 (None, 1280) 0
(GlobalAveragePooling2D)
dense_30 (Dense) (None, 1024) 1311744
dropout_28 (Dropout) (None, 1024) 0
dense_31 (Dense) (None, 512) 524800
dropout_29 (Dropout) (None, 512) 0
dense_32 (Dense) (None, 3) 1539
=================================================================
Total params: 5887654 (22.46 MB)
Trainable params: 5845631 (22.30 MB)
Non-trainable params: 42023 (164.16 KB)
_________________________________________________________________
Validation Accuracy for EfficientNetB0: 0.4194
Test Accuracy for EfficientNetB0: 0.3333
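For an easier side-by-side comparison, the per-model accuracies printed above can be collected into a small DataFrame (the values are hardcoded here from the evaluation output above):

```python
import pandas as pd

# Validation/test accuracies as reported by the evaluation loop above
results = pd.DataFrame({
    "model": ["VGG16", "InceptionV3", "ResNet50", "DenseNet121",
              "MobileNetV2", "Xception", "EfficientNetB0"],
    "val_accuracy": [0.3333, 0.7250, 0.4000, 0.7028, 0.6417, 0.7472, 0.4194],
    "test_accuracy": [0.2978, 0.3422, 0.3600, 0.3600, 0.3644, 0.3333, 0.3333],
}).sort_values("val_accuracy", ascending=False).reset_index(drop=True)

print(results)
```

Xception and InceptionV3 lead on validation, while every test accuracy sits near the 1/3 chance level for this 3-class problem, which points at a train/test preprocessing mismatch rather than model capacity.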
from tensorflow.keras.models import load_model
# Load the saved model
model = load_model('best_model_1.h5')
# Note: if no transformations were applied to the validation set during training,
# make sure the same validate_datagen rescaling is used for evaluation.
loss, accuracy = model.evaluate(validation_generator)
print(f"Validation Loss: {loss}")
print(f"Validation Accuracy: {accuracy * 100}%")
# Compute test accuracy (X_test should be resized and rescaled the same way
# as the validation data for a fair comparison)
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Test Accuracy: {test_accuracy:.4f}")
import os
import numpy as np
import pandas as pd
import pydicom
import cv2
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from tensorflow.keras.optimizers import Adam
from sklearn.preprocessing import LabelBinarizer
from tensorflow.keras.regularizers import l2
import matplotlib.pyplot as plt
# Sample a subset of the data
sample_trainingdata = training_data.groupby('class', group_keys=False).apply(lambda x: x.sample(4000))
# Preprocess DICOM images
ADJUSTED_IMAGE_SIZE = 128
def read_and_reshape_image(image):
    img = np.array(image).astype(np.uint8)
    res = cv2.resize(img, (ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE), interpolation=cv2.INTER_LINEAR)
    return res

def populate_image(data):
    images = []
    labels = []
    for index, row in data.iterrows():
        patientId = row.patientId
        classlabel = row["class"]
        dcm_file = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images/' + '{}.dcm'.format(patientId)
        dcm_data = pydicom.dcmread(dcm_file)  # pydicom.read_file is deprecated in favour of dcmread
        img = dcm_data.pixel_array
        # replicate single-channel DICOM pixel data across 3 channels
        if len(img.shape) != 3 or img.shape[2] != 3:
            img = np.stack((img,) * 3, -1)
        images.append(read_and_reshape_image(img))
        labels.append(classlabel)
    images = np.array(images)
    labels = np.array(labels)
    return images, labels
images, labels = populate_image(sample_trainingdata)
# Encode the labels
enc = LabelBinarizer()
encoded_labels = enc.fit_transform(labels)
# Split the data
X_train, X_validate, y_train, y_validate = train_test_split(images, encoded_labels, test_size=0.1, stratify=labels, random_state=42)
# Data Augmentation
BATCH_SIZE = 256
train_datagen = ImageDataGenerator(
    rotation_range=20,
    rescale=1./255,
    shear_range=0.15,
    zoom_range=0.3,
    horizontal_flip=True,
    width_shift_range=0.15,
    height_shift_range=0.15
)
train_generator = train_datagen.flow(X_train, y_train, batch_size=BATCH_SIZE)
validate_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validate_datagen.flow(X_validate, y_validate, batch_size=BATCH_SIZE)
# U-Net Model
def unet_model(input_size=(ADJUSTED_IMAGE_SIZE, ADJUSTED_IMAGE_SIZE, 3)):  # adjusted input size
    inputs = keras.layers.Input(input_size)
    # Contracting path
    c1 = keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)
    p1 = keras.layers.MaxPooling2D((2, 2))(c1)
    c2 = keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same')(p1)
    p2 = keras.layers.MaxPooling2D((2, 2))(c2)
    # At the lowest level
    c3 = keras.layers.Conv2D(256, (3, 3), activation='relu', padding='same')(p2)
    # Expansive path
    u2 = keras.layers.UpSampling2D((2, 2))(c3)
    c4 = keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same')(u2)
    u1 = keras.layers.UpSampling2D((2, 2))(c4)
    c5 = keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same')(u1)
    # Flatten the output and pass it through dense layers for classification
    flat = keras.layers.Flatten()(c5)
    dense1 = keras.layers.Dense(128, activation='relu')(flat)
    dropout = keras.layers.Dropout(0.5)(dense1)
    outputs = keras.layers.Dense(3, activation='softmax')(dropout)  # 3 classes
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model
#print(model.summary())
unet = unet_model()
unet.compile(optimizer=Adam(learning_rate=0.0005), loss='categorical_crossentropy', metrics=['accuracy']) # Using categorical_crossentropy
earlystop = EarlyStopping(patience=10)
learning_rate_reduction = ReduceLROnPlateau(monitor='val_loss', patience=2, verbose=1, factor=0.5, min_lr=0.00001)
# Learning rate scheduler
def scheduler(epoch, lr):
    # keep the initial learning rate for the first 5 epochs, then decay exponentially
    if epoch < 5:
        return lr
    else:
        return lr * tf.math.exp(-0.1)
lr_schedule_callback = tf.keras.callbacks.LearningRateScheduler(scheduler)
checkpoint = ModelCheckpoint("best_unet_model.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks = [earlystop, learning_rate_reduction, checkpoint, lr_schedule_callback]
history = unet.fit(
    train_generator,
    epochs=15,
    validation_data=validation_generator,
    validation_steps=len(X_validate) // BATCH_SIZE,
    callbacks=callbacks
)
# Plotting graphs post training
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('U-Net - Loss')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('U-Net - Accuracy')
plt.legend()
plt.tight_layout()
plt.show()
(training log reflowed; the run ends partway through epoch 7)

best_unet_model, epochs 1-6 (43/43 steps, ~1330-1480s/epoch):
Epoch  loss    accuracy  val_loss  val_accuracy  lr          note
 1     1.4761  0.3449    1.0918    0.3389        5.0000e-04  saved
 2     1.0888  0.3768    1.0739    0.3906        5.0000e-04  saved
 3     1.0837  0.4039    1.0456    0.4736        5.0000e-04  saved
 4     1.0572  0.4369    1.0230    0.4619        5.0000e-04
 5     1.0486  0.4505    0.9975    0.5000        5.0000e-04  saved
 6     1.0328  0.4640    0.9671    0.5137        4.5242e-04  saved
 7     (in progress when the log ends: 32/43 steps, loss 1.0210, accuracy 0.4719)
Since our validation and test results were not satisfactory, we are trying this custom architecture.
The architecture is a custom deep convolutional neural network that combines elements of the U-Net and ResNet architectures.
It starts with an initial convolution, followed by a series of downsampling and residual blocks.
The final layers upscale the output to match the input size.
This combination aims to capture both local features and global context in the images, making it potentially effective for segmentation tasks like identifying pneumonia locations in lung scans.
# Loading dependencies
import os
import csv
import random
import pydicom
import numpy as np
import pandas as pd
from skimage import measure
from skimage.transform import resize
import matplotlib.patches as patches
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
%matplotlib inline
2023-10-26 21:41:01.823506: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
pneumonia_locations = {}
# load table
with open(os.path.join('/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_labels.csv'), 'r') as infile:
    # open reader
    reader = csv.reader(infile)
    # skip header
    next(reader, None)
    # loop through rows
    for row in reader:
        # retrieve information
        filename = row[0]
        location = row[1:5]
        pneumonia = row[5]
        # if the row is pneumonia-positive, add its bounding box to the dictionary,
        # which stores a list of pneumonia locations per filename
        if pneumonia == '1':
            # convert string to float to int
            location = [int(float(i)) for i in location]
            # save pneumonia location in dictionary
            if filename in pneumonia_locations:
                pneumonia_locations[filename].append(location)
            else:
                pneumonia_locations[filename] = [location]
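As a sketch of a more compact alternative (assuming the same `stage_2_train_labels.csv` schema), the dictionary can also be built with a pandas groupby; the sample rows below are hypothetical:

```python
import pandas as pd
from io import StringIO

# Tiny inline sample with the stage_2_train_labels.csv schema (hypothetical values)
csv_text = """patientId,x,y,width,height,Target
p1,264.0,152.0,213.0,379.0,1
p1,562.0,152.0,256.0,453.0,1
p2,,,,,0
"""
labels = pd.read_csv(StringIO(csv_text))

# Keep only pneumonia-positive rows and group the boxes per patient
pos = labels[labels["Target"] == 1]
pneumonia_locations = {
    pid: grp[["x", "y", "width", "height"]].astype(int).values.tolist()
    for pid, grp in pos.groupby("patientId")
}
print(pneumonia_locations)  # {'p1': [[264, 152, 213, 379], [562, 152, 256, 453]]}
```

Negative patients (all-NaN box columns) simply never appear in the dictionary, matching the csv-reader loop above.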
# load and shuffle filenames
folder = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images'
filenames = os.listdir(folder)
random.shuffle(filenames)
# split into train and validation filenames
n_valid_samples = 8000
train_filenames = filenames[n_valid_samples:]
valid_filenames = filenames[:n_valid_samples]
print('n train samples', len(train_filenames))
print('n valid samples', len(valid_filenames))
n_train_samples = len(filenames) - n_valid_samples
n train samples 18684
n valid samples 8000
The dataset is too large to fit into memory, so we create a generator that loads data on the fly. The generator takes the image folder, a list of filenames, the pneumonia-location dictionary, a batch size, and a few other parameters, and yields random batches of NumPy images together with their corresponding NumPy masks.
class generator(keras.utils.Sequence):
    def __init__(self, folder, filenames, pneumonia_locations=None, batch_size=32,
                 image_size=256, shuffle=True, augment=False, predict=False):
        self.folder = folder
        self.filenames = filenames
        self.pneumonia_locations = pneumonia_locations
        self.batch_size = batch_size
        self.image_size = image_size
        self.shuffle = shuffle
        self.augment = augment
        self.predict = predict
        self.on_epoch_end()

    def __load__(self, filename):
        # load dicom file as numpy array
        img = pydicom.dcmread(os.path.join(self.folder, filename)).pixel_array
        # create empty mask
        msk = np.zeros(img.shape)
        # get filename without extension
        filename = filename.split('.')[0]
        # if image contains pneumonia
        if filename in self.pneumonia_locations:
            # loop through pneumonia boxes
            for location in self.pneumonia_locations[filename]:
                # add 1's at the location of the pneumonia
                x, y, w, h = location
                msk[y:y+h, x:x+w] = 1
        # resize both image and mask
        img = resize(img, (self.image_size, self.image_size), mode='reflect')
        msk = resize(msk, (self.image_size, self.image_size), mode='reflect') > 0.5
        # if augment then horizontal flip half the time
        if self.augment and random.random() > 0.5:
            img = np.fliplr(img)
            msk = np.fliplr(msk)
        # add trailing channel dimension
        img = np.expand_dims(img, -1)
        msk = np.expand_dims(msk, -1)
        return img, msk

    def __loadpredict__(self, filename):
        # load dicom file as numpy array
        img = pydicom.dcmread(os.path.join(self.folder, filename)).pixel_array
        # resize image
        img = resize(img, (self.image_size, self.image_size), mode='reflect')
        # add trailing channel dimension
        img = np.expand_dims(img, -1)
        return img

    def __getitem__(self, index):
        # select batch
        filenames = self.filenames[index*self.batch_size:(index+1)*self.batch_size]
        # predict mode: return images and filenames
        if self.predict:
            # load files
            imgs = [self.__loadpredict__(filename) for filename in filenames]
            # create numpy batch
            imgs = np.array(imgs)
            return imgs, filenames
        # train mode: return images and masks
        else:
            # load files
            items = [self.__load__(filename) for filename in filenames]
            # unzip images and masks
            imgs, msks = zip(*items)
            # create numpy batch
            imgs = np.array(imgs)
            msks = np.array(msks)
            return imgs, msks

    def on_epoch_end(self):
        if self.shuffle:
            random.shuffle(self.filenames)

    def __len__(self):
        if self.predict:
            # predict mode: include the final partial batch
            return int(np.ceil(len(self.filenames) / self.batch_size))
        else:
            # train mode: full batches only
            return len(self.filenames) // self.batch_size
In summary, this generator class provides an efficient way to load and preprocess batches of medical images (and their annotations) for training or prediction with Keras models. The class is well-suited for tasks like segmentation where we need both images and their corresponding masks.
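The heart of `__load__` is rasterizing the `[x, y, w, h]` bounding boxes into a binary mask; as a standalone sketch of just that step (box values made up for illustration):

```python
import numpy as np

def boxes_to_mask(shape, boxes):
    """Rasterize [x, y, w, h] pneumonia boxes into a binary mask, as __load__ does."""
    msk = np.zeros(shape, dtype=np.uint8)
    for x, y, w, h in boxes:
        msk[y:y+h, x:x+w] = 1  # row-major indexing: rows = y, columns = x
    return msk

# Hypothetical 1024x1024 radiograph with two non-overlapping boxes
mask = boxes_to_mask((1024, 1024), [[264, 152, 213, 379], [562, 152, 256, 453]])
print(mask.sum())  # 213*379 + 256*453 = 196695 positive pixels
```

Note the y/x order when indexing: DICOM pixel arrays are row-major, so getting this backwards silently produces transposed masks.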
# define iou or jaccard loss function
def iou_loss(y_true, y_pred):
    y_true = tf.cast(y_true, tf.float32)
    y_pred = tf.cast(y_pred, tf.float32)
    y_true = tf.reshape(y_true, [-1])
    y_pred = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true * y_pred)
    score = (intersection + 1.) / (tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) - intersection + 1.)
    return 1 - score

# combine bce loss and iou loss
def iou_bce_loss(y_true, y_pred):
    return 0.5 * keras.losses.binary_crossentropy(y_true, y_pred) + 0.5 * iou_loss(y_true, y_pred)

# mean iou as a metric
def mean_iou(y_true, y_pred):
    y_pred = tf.round(y_pred)
    intersect = tf.reduce_sum(y_true * y_pred, axis=[1, 2, 3])
    union = tf.reduce_sum(y_true, axis=[1, 2, 3]) + tf.reduce_sum(y_pred, axis=[1, 2, 3])
    smooth = tf.ones(tf.shape(intersect))
    return tf.reduce_mean((intersect + smooth) / (union - intersect + smooth))
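To sanity-check the loss definition, here is the same IoU computation mirrored in plain NumPy on a toy pair of masks (equivalent to the TF version above, including the +1 smoothing term):

```python
import numpy as np

def iou_loss_np(y_true, y_pred):
    """NumPy mirror of iou_loss: 1 - (|A∩B| + 1) / (|A∪B| + 1)."""
    y_true = y_true.astype(np.float32).ravel()
    y_pred = y_pred.astype(np.float32).ravel()
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    return 1.0 - (intersection + 1.0) / (union + 1.0)

# Toy 2x2 masks: 1 pixel of overlap, 3 pixels in the union
y_true = np.array([[1, 1], [0, 0]])
y_pred = np.array([[1, 0], [1, 0]])
print(iou_loss_np(y_true, y_pred))  # 1 - (1+1)/(3+1) = 0.5
```

The smoothing term keeps the loss finite (and exactly 0) when both masks are empty, which happens often in this dataset since most radiographs are pneumonia-negative.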
def create_downsample(channels, inputs):
    x = keras.layers.BatchNormalization(momentum=0.9)(inputs)
    x = keras.layers.LeakyReLU(0)(x)
    x = keras.layers.Conv2D(channels, 1, padding='same', use_bias=False)(x)
    x = keras.layers.MaxPool2D(2)(x)
    return x

def create_resblock(channels, inputs):
    x = keras.layers.BatchNormalization(momentum=0.9)(inputs)
    x = keras.layers.LeakyReLU(0)(x)
    x = keras.layers.Conv2D(channels, 3, padding='same', use_bias=False)(x)
    x = keras.layers.BatchNormalization(momentum=0.9)(x)
    x = keras.layers.LeakyReLU(0)(x)
    x = keras.layers.Conv2D(channels, 3, padding='same', use_bias=False)(x)
    return keras.layers.add([x, inputs])

def create_network(input_size, channels, n_blocks=2, depth=4):
    # input
    inputs = keras.Input(shape=(input_size, input_size, 1))
    x = keras.layers.Conv2D(channels, 3, padding='same', use_bias=False)(inputs)
    # residual blocks
    for d in range(depth):
        channels = channels * 2
        x = create_downsample(channels, x)
        for b in range(n_blocks):
            x = create_resblock(channels, x)
    # output
    x = keras.layers.BatchNormalization(momentum=0.9)(x)
    x = keras.layers.LeakyReLU(0)(x)
    x = keras.layers.Conv2D(1, 1, activation='sigmoid')(x)
    outputs = keras.layers.UpSampling2D(2**depth)(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model
Downsampling block (create_downsample):
Purpose: The main objective of the downsampling block is to reduce the spatial dimensions of the feature maps. This increases the receptive field, letting the network focus on more abstract, high-level features as it goes deeper.
Components:
Batch Normalization: This layer normalizes the activations of the previous layer, meaning that it will make the activations to have zero mean and unit variance. This helps to improve the convergence speed and stability of the network.
Leaky ReLU Activation: Instead of the traditional ReLU, this block uses Leaky ReLU, which lets a small gradient pass through when a unit is inactive rather than clamping all negative values to 0; this can help prevent dead neurons during training. (Note, however, that the slope is set to 0 here, so the layer behaves exactly like a standard ReLU.)
1x1 Convolution: This is a convolution with a kernel size of 1x1. The primary purpose of 1x1 convolutions is to adjust the number of channels (depth) without changing the spatial dimensions. In this context, it's used to increase the depth to the given channels.
Max-Pooling: A max-pooling layer with a 2x2 filter is used to reduce the spatial dimensions by half. It takes the maximum value from a 2x2 patch of the input data.
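As a minimal NumPy sketch (separate from the Keras layers above), here is what 2x2 max-pooling does to a toy 4x4 feature map: each non-overlapping 2x2 patch collapses to its maximum, halving both spatial dimensions. The helper `max_pool_2x2` is hypothetical, for illustration only.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Toy 2x2 max-pooling: keep the max of each non-overlapping 2x2 patch."""
    h, w = feature_map.shape
    # group rows and columns into pairs, then reduce over each pair
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.arange(16).reshape(4, 4)   # 4x4 input feature map
pooled = max_pool_2x2(fm)          # 2x2 output: spatial dims halved
# pooled -> [[5, 7], [13, 15]]
```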
Network builder (create_network):
Purpose: This function defines the overall structure of the neural network. The architecture is designed for tasks like image segmentation, where each pixel of the input image is classified into certain classes.
Components:
Input Layer: Takes in an image of size (input_size, input_size, 1). The 1 indicates that the images are grayscale (single channel).
Initial Convolution: This is a standard convolutional layer with a 3x3 filter size. It's used to extract initial low-level features from the input image. The number of output channels is determined by the channels argument.
Downsampling and Residual Blocks:
The depth parameter determines how many times this loop runs, i.e., how many times the spatial dimensions are halved. After each downsampling step, n_blocks residual blocks are added; their shortcut connections help the network learn complex representations without the risk of vanishing gradients.
Final Layers:
Batch Normalization: This normalizes the activations from the previous layer.
Leaky ReLU Activation: A non-linear activation function.
1x1 Convolution with Sigmoid Activation: This layer outputs the final predicted mask. The sigmoid activation ensures that the output values are between 0 and 1, which can be interpreted as probabilities for the segmentation task.
Upsampling: The output from the previous layer might be of reduced spatial dimensions due to the downsampling blocks. The upsampling layer increases the spatial dimensions of the output to match that of the original input image. The factor by which the output is upsampled is 2**depth, restoring the spatial dimensions back to their original size.
Model Creation: The function wraps everything into a Keras Model object with the given inputs and outputs.
The overall architecture, with its combination of downsampling, residual blocks, and upsampling, bears similarities to a U-Net, which is a popular architecture for image segmentation tasks. However, it doesn't have the skip connections that U-Net typically has. Instead, it leverages residual blocks for feature learning.
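Tracing the spatial dimensions through this architecture (a small sketch under the notebook's own settings, input_size=128 and depth=4; the helper name is hypothetical) shows why the final UpSampling2D factor has to be 2**depth:

```python
def trace_spatial_size(input_size=128, depth=4):
    """Follow H (= W) through the downsampling path and the final upsample."""
    sizes = [input_size]
    for _ in range(depth):
        sizes.append(sizes[-1] // 2)    # each downsample block halves H and W
    sizes.append(sizes[-1] * 2**depth)  # UpSampling2D(2**depth) restores them
    return sizes

# 128 -> 64 -> 32 -> 16 -> 8 at the bottleneck, then back to 128
path = trace_spatial_size()
```

The 8x8 bottleneck agrees with the (None, 8, 8, 1) shape of conv2d_21 in the model summary.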
Intersection Over Union (IOU) Loss (iou_loss):
- This function calculates the IOU loss between the true labels (`y_true`) and the predicted labels (`y_pred`).
- IOU is the ratio of the area of overlap to the area of union between the true and predicted labels. It's commonly used in object detection and segmentation tasks.
- The loss is `1 - IOU`, so a perfect prediction would have an IOU of 1 and a loss of 0.
Combined IOU and Binary Cross-Entropy Loss (iou_bce_loss):
- This function combines the IOU loss (explained above) with the binary cross-entropy (BCE) loss.
- The final loss is the average of the IOU loss and the BCE loss. This combination helps to optimize both the classification and localization aspects in object detection tasks.
Mean IOU Metric (mean_iou):
- This function calculates the mean IOU metric across a batch of images.
- It first rounds off the predicted values to get binary masks.
- Then, it computes the intersection and union of the true and predicted masks.
- The mean IOU is the average ratio of the intersection to union across all images in the batch.
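The boundary behavior of the soft-IOU loss can be made concrete with a NumPy mirror of the TF function above (the helper `iou_loss_np` is hypothetical; its `smooth=1.0` argument stands in for the `+1.` terms):

```python
import numpy as np

def iou_loss_np(y_true, y_pred, smooth=1.0):
    """NumPy mirror of the TF iou_loss: 1 - smoothed intersection/union."""
    y_true = y_true.ravel().astype(float)
    y_pred = y_pred.ravel().astype(float)
    inter = np.sum(y_true * y_pred)
    score = (inter + smooth) / (np.sum(y_true) + np.sum(y_pred) - inter + smooth)
    return 1.0 - score

mask = np.array([[1, 1], [0, 0]])
perfect = iou_loss_np(mask, mask)        # identical masks -> loss 0
disjoint = iou_loss_np(mask, 1 - mask)   # no overlap -> loss 0.8 (smoothing keeps it below 1)
```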
In summary, this code defines a custom deep architecture that employs residual blocks and a combined IOU + BCE loss for training. The architecture is designed for segmentation tasks where we want to predict pixel-wise masks for objects in images.
BATCH_SIZE = 128
IMAGE_SIZE = 128
model = create_network(input_size=IMAGE_SIZE, channels=32, n_blocks=2, depth=4)
model.compile(optimizer='adam', loss=iou_bce_loss, metrics=['accuracy', mean_iou])
# cosine learning rate annealing
def cosine_annealing(x):
    lr = 0.0001
    epochs = 3
    return lr * (np.cos(np.pi * x / epochs) + 1.) / 2

learning_rate = tf.keras.callbacks.LearningRateScheduler(cosine_annealing)
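A quick sanity check of the schedule (the function is repeated so the snippet is self-contained): evaluating it at epoch indices 0, 1, and 2 reproduces the per-epoch learning rates printed in the training log (1.0000e-04, 7.5000e-05, 2.5000e-05).

```python
import numpy as np

def cosine_annealing(x):
    lr = 0.0001
    epochs = 3
    return lr * (np.cos(np.pi * x / epochs) + 1.) / 2

# cos(0) = 1, cos(pi/3) = 0.5, cos(2*pi/3) = -0.5
lrs = [cosine_annealing(e) for e in range(3)]
```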
# create train and validation generators
folder = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images'
train_gen = generator(folder, train_filenames, pneumonia_locations, batch_size=BATCH_SIZE,
image_size=IMAGE_SIZE, shuffle=True, augment=False, predict=False)
valid_gen = generator(folder, valid_filenames, pneumonia_locations, batch_size=BATCH_SIZE,
image_size=IMAGE_SIZE, shuffle=False, predict=False)
print(model.summary())
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 128, 128, 1)] 0 []
conv2d (Conv2D) (None, 128, 128, 32) 288 ['input_1[0][0]']
batch_normalization (Batch (None, 128, 128, 32) 128 ['conv2d[0][0]']
Normalization)
leaky_re_lu (LeakyReLU) (None, 128, 128, 32) 0 ['batch_normalization[0][0]']
conv2d_1 (Conv2D) (None, 128, 128, 64) 2048 ['leaky_re_lu[0][0]']
max_pooling2d (MaxPooling2 (None, 64, 64, 64) 0 ['conv2d_1[0][0]']
D)
batch_normalization_1 (Bat (None, 64, 64, 64) 256 ['max_pooling2d[0][0]']
chNormalization)
leaky_re_lu_1 (LeakyReLU) (None, 64, 64, 64) 0 ['batch_normalization_1[0][0]'
]
conv2d_2 (Conv2D) (None, 64, 64, 64) 36864 ['leaky_re_lu_1[0][0]']
batch_normalization_2 (Bat (None, 64, 64, 64) 256 ['conv2d_2[0][0]']
chNormalization)
leaky_re_lu_2 (LeakyReLU) (None, 64, 64, 64) 0 ['batch_normalization_2[0][0]'
]
conv2d_3 (Conv2D) (None, 64, 64, 64) 36864 ['leaky_re_lu_2[0][0]']
add (Add) (None, 64, 64, 64) 0 ['conv2d_3[0][0]',
'max_pooling2d[0][0]']
batch_normalization_3 (Bat (None, 64, 64, 64) 256 ['add[0][0]']
chNormalization)
leaky_re_lu_3 (LeakyReLU) (None, 64, 64, 64) 0 ['batch_normalization_3[0][0]'
]
conv2d_4 (Conv2D) (None, 64, 64, 64) 36864 ['leaky_re_lu_3[0][0]']
batch_normalization_4 (Bat (None, 64, 64, 64) 256 ['conv2d_4[0][0]']
chNormalization)
leaky_re_lu_4 (LeakyReLU) (None, 64, 64, 64) 0 ['batch_normalization_4[0][0]'
]
conv2d_5 (Conv2D) (None, 64, 64, 64) 36864 ['leaky_re_lu_4[0][0]']
add_1 (Add) (None, 64, 64, 64) 0 ['conv2d_5[0][0]',
'add[0][0]']
batch_normalization_5 (Bat (None, 64, 64, 64) 256 ['add_1[0][0]']
chNormalization)
leaky_re_lu_5 (LeakyReLU) (None, 64, 64, 64) 0 ['batch_normalization_5[0][0]'
]
conv2d_6 (Conv2D) (None, 64, 64, 128) 8192 ['leaky_re_lu_5[0][0]']
max_pooling2d_1 (MaxPoolin (None, 32, 32, 128) 0 ['conv2d_6[0][0]']
g2D)
batch_normalization_6 (Bat (None, 32, 32, 128) 512 ['max_pooling2d_1[0][0]']
chNormalization)
leaky_re_lu_6 (LeakyReLU) (None, 32, 32, 128) 0 ['batch_normalization_6[0][0]'
]
conv2d_7 (Conv2D) (None, 32, 32, 128) 147456 ['leaky_re_lu_6[0][0]']
batch_normalization_7 (Bat (None, 32, 32, 128) 512 ['conv2d_7[0][0]']
chNormalization)
leaky_re_lu_7 (LeakyReLU) (None, 32, 32, 128) 0 ['batch_normalization_7[0][0]'
]
conv2d_8 (Conv2D) (None, 32, 32, 128) 147456 ['leaky_re_lu_7[0][0]']
add_2 (Add) (None, 32, 32, 128) 0 ['conv2d_8[0][0]',
'max_pooling2d_1[0][0]']
batch_normalization_8 (Bat (None, 32, 32, 128) 512 ['add_2[0][0]']
chNormalization)
leaky_re_lu_8 (LeakyReLU) (None, 32, 32, 128) 0 ['batch_normalization_8[0][0]'
]
conv2d_9 (Conv2D) (None, 32, 32, 128) 147456 ['leaky_re_lu_8[0][0]']
batch_normalization_9 (Bat (None, 32, 32, 128) 512 ['conv2d_9[0][0]']
chNormalization)
leaky_re_lu_9 (LeakyReLU) (None, 32, 32, 128) 0 ['batch_normalization_9[0][0]'
]
conv2d_10 (Conv2D) (None, 32, 32, 128) 147456 ['leaky_re_lu_9[0][0]']
add_3 (Add) (None, 32, 32, 128) 0 ['conv2d_10[0][0]',
'add_2[0][0]']
batch_normalization_10 (Ba (None, 32, 32, 128) 512 ['add_3[0][0]']
tchNormalization)
leaky_re_lu_10 (LeakyReLU) (None, 32, 32, 128) 0 ['batch_normalization_10[0][0]
']
conv2d_11 (Conv2D) (None, 32, 32, 256) 32768 ['leaky_re_lu_10[0][0]']
max_pooling2d_2 (MaxPoolin (None, 16, 16, 256) 0 ['conv2d_11[0][0]']
g2D)
batch_normalization_11 (Ba (None, 16, 16, 256) 1024 ['max_pooling2d_2[0][0]']
tchNormalization)
leaky_re_lu_11 (LeakyReLU) (None, 16, 16, 256) 0 ['batch_normalization_11[0][0]
']
conv2d_12 (Conv2D) (None, 16, 16, 256) 589824 ['leaky_re_lu_11[0][0]']
batch_normalization_12 (Ba (None, 16, 16, 256) 1024 ['conv2d_12[0][0]']
tchNormalization)
leaky_re_lu_12 (LeakyReLU) (None, 16, 16, 256) 0 ['batch_normalization_12[0][0]
']
conv2d_13 (Conv2D) (None, 16, 16, 256) 589824 ['leaky_re_lu_12[0][0]']
add_4 (Add) (None, 16, 16, 256) 0 ['conv2d_13[0][0]',
'max_pooling2d_2[0][0]']
batch_normalization_13 (Ba (None, 16, 16, 256) 1024 ['add_4[0][0]']
tchNormalization)
leaky_re_lu_13 (LeakyReLU) (None, 16, 16, 256) 0 ['batch_normalization_13[0][0]
']
conv2d_14 (Conv2D) (None, 16, 16, 256) 589824 ['leaky_re_lu_13[0][0]']
batch_normalization_14 (Ba (None, 16, 16, 256) 1024 ['conv2d_14[0][0]']
tchNormalization)
leaky_re_lu_14 (LeakyReLU) (None, 16, 16, 256) 0 ['batch_normalization_14[0][0]
']
conv2d_15 (Conv2D) (None, 16, 16, 256) 589824 ['leaky_re_lu_14[0][0]']
add_5 (Add) (None, 16, 16, 256) 0 ['conv2d_15[0][0]',
'add_4[0][0]']
batch_normalization_15 (Ba (None, 16, 16, 256) 1024 ['add_5[0][0]']
tchNormalization)
leaky_re_lu_15 (LeakyReLU) (None, 16, 16, 256) 0 ['batch_normalization_15[0][0]
']
conv2d_16 (Conv2D) (None, 16, 16, 512) 131072 ['leaky_re_lu_15[0][0]']
max_pooling2d_3 (MaxPoolin (None, 8, 8, 512) 0 ['conv2d_16[0][0]']
g2D)
batch_normalization_16 (Ba (None, 8, 8, 512) 2048 ['max_pooling2d_3[0][0]']
tchNormalization)
leaky_re_lu_16 (LeakyReLU) (None, 8, 8, 512) 0 ['batch_normalization_16[0][0]
']
conv2d_17 (Conv2D) (None, 8, 8, 512) 2359296 ['leaky_re_lu_16[0][0]']
batch_normalization_17 (Ba (None, 8, 8, 512) 2048 ['conv2d_17[0][0]']
tchNormalization)
leaky_re_lu_17 (LeakyReLU) (None, 8, 8, 512) 0 ['batch_normalization_17[0][0]
']
conv2d_18 (Conv2D) (None, 8, 8, 512) 2359296 ['leaky_re_lu_17[0][0]']
add_6 (Add) (None, 8, 8, 512) 0 ['conv2d_18[0][0]',
'max_pooling2d_3[0][0]']
batch_normalization_18 (Ba (None, 8, 8, 512) 2048 ['add_6[0][0]']
tchNormalization)
leaky_re_lu_18 (LeakyReLU) (None, 8, 8, 512) 0 ['batch_normalization_18[0][0]
']
conv2d_19 (Conv2D) (None, 8, 8, 512) 2359296 ['leaky_re_lu_18[0][0]']
batch_normalization_19 (Ba (None, 8, 8, 512) 2048 ['conv2d_19[0][0]']
tchNormalization)
leaky_re_lu_19 (LeakyReLU) (None, 8, 8, 512) 0 ['batch_normalization_19[0][0]
']
conv2d_20 (Conv2D) (None, 8, 8, 512) 2359296 ['leaky_re_lu_19[0][0]']
add_7 (Add) (None, 8, 8, 512) 0 ['conv2d_20[0][0]',
'add_6[0][0]']
batch_normalization_20 (Ba (None, 8, 8, 512) 2048 ['add_7[0][0]']
tchNormalization)
leaky_re_lu_20 (LeakyReLU) (None, 8, 8, 512) 0 ['batch_normalization_20[0][0]
']
conv2d_21 (Conv2D) (None, 8, 8, 1) 513 ['leaky_re_lu_20[0][0]']
up_sampling2d (UpSampling2 (None, 128, 128, 1) 0 ['conv2d_21[0][0]']
D)
==================================================================================================
Total params: 12727969 (48.55 MB)
Trainable params: 12718305 (48.52 MB)
Non-trainable params: 9664 (37.75 KB)
__________________________________________________________________________________________________
None
EPOCHS=3
history = model.fit(train_gen, validation_data=valid_gen, callbacks=[learning_rate], epochs=EPOCHS)
Epoch 1/3
145/145 [==============================] - 3130s 22s/step - loss: 0.5257 - accuracy: 0.9388 - mean_iou: 0.6421 - val_loss: 0.4558 - val_accuracy: 0.9680 - val_mean_iou: 0.7189 - lr: 1.0000e-04
Epoch 2/3
145/145 [==============================] - 3132s 22s/step - loss: 0.4399 - accuracy: 0.9679 - mean_iou: 0.7247 - val_loss: 0.4620 - val_accuracy: 0.9489 - val_mean_iou: 0.6479 - lr: 7.5000e-05
Epoch 3/3
145/145 [==============================] - 3384s 23s/step - loss: 0.4203 - accuracy: 0.9701 - mean_iou: 0.7398 - val_loss: 0.4287 - val_accuracy: 0.9676 - val_mean_iou: 0.7210 - lr: 2.5000e-05
plt.figure(figsize=(12,4))
plt.subplot(131)
plt.plot(history.epoch, history.history["loss"], label="Train loss")
plt.plot(history.epoch, history.history["val_loss"], label="Valid loss")
plt.legend()
plt.subplot(132)
plt.plot(history.epoch, history.history["accuracy"], label="Train accuracy")
plt.plot(history.epoch, history.history["val_accuracy"], label="Valid accuracy")
plt.legend()
plt.subplot(133)
plt.plot(history.epoch, history.history["mean_iou"], label="Train iou")
plt.plot(history.epoch, history.history["val_mean_iou"], label="Valid iou")
plt.legend()
plt.show()
i = 0
for imgs, msks in valid_gen:
    # predict batch of images
    preds = model.predict(imgs)
    # create figure
    f, axarr = plt.subplots(4, 8, figsize=(20, 15))
    axarr = axarr.ravel()
    axidx = 0
    # loop through batch
    for img, msk, pred in zip(imgs, msks, preds):
        i += 1
        # exit after 32 images
        if i > 32:
            break
        # plot image
        axarr[axidx].imshow(img[:, :, 0])
        # threshold true mask and label connected components
        comp = measure.label(msk[:, :, 0] > 0.5)
        # draw a blue bounding box around each ground-truth region
        for region in measure.regionprops(comp):
            # region.bbox is (min_row, min_col, max_row, max_col)
            y, x, y2, x2 = region.bbox
            axarr[axidx].add_patch(Rectangle((x, y), x2 - x, y2 - y, linewidth=2,
                                             edgecolor='b', facecolor='none'))
        # threshold predicted mask and label connected components
        comp = measure.label(pred[:, :, 0] > 0.5)
        # draw a red bounding box around each predicted region
        for region in measure.regionprops(comp):
            y, x, y2, x2 = region.bbox
            axarr[axidx].add_patch(Rectangle((x, y), x2 - x, y2 - y, linewidth=2,
                                             edgecolor='r', facecolor='none'))
        axidx += 1
    plt.show()
    # only plot one batch
    break
4/4 [==============================] - 3s 638ms/step
We achieved a validation accuracy of roughly 95-97% with a final validation loss of about 0.43; mean IOU on the validation set stayed above 0.64 throughout, reaching about 0.72.
def iou(box1, box2):
    # boxes are (x, y, w, h)
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    xi1 = max(x1, x2)
    yi1 = max(y1, y2)
    xi2 = min(x1 + w1, x2 + w2)
    yi2 = min(y1 + h1, y2 + h2)
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    box1_area = w1 * h1
    box2_area = w2 * h2
    union_area = box1_area + box2_area - inter_area
    return inter_area / union_area
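A small usage example (the function is repeated so the snippet is self-contained): two 10x10 boxes offset by (5, 5) intersect in a 5x5 patch, giving IOU 25/175 ≈ 0.143.

```python
def iou(box1, box2):
    # boxes are (x, y, w, h)
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    xi1, yi1 = max(x1, x2), max(y1, y2)
    xi2, yi2 = min(x1 + w1, x2 + w2), min(y1 + h1, y2 + h2)
    inter = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    return inter / (w1 * h1 + w2 * h2 - inter)

overlap = iou((0, 0, 10, 10), (5, 5, 10, 10))    # intersection 25, union 175
identical = iou((0, 0, 10, 10), (0, 0, 10, 10))  # perfect match -> 1.0
```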
def prepare_rpn_labels(X, y_boxes, anchor_sizes=[128, 256, 512], anchor_ratios=[0.5, 1, 2]):
    rpn_class = np.zeros((len(X), 7, 7, len(anchor_sizes) * len(anchor_ratios)))
    rpn_regr = np.zeros((len(X), 7, 7, 4 * len(anchor_sizes) * len(anchor_ratios)))
    # width and height of one feature-map cell on the 224x224 input
    cell_width = 224 / 7
    cell_height = 224 / 7
    for i in range(len(X)):
        for j in range(7):
            for k in range(7):
                # anchor centre for this cell
                cx = (j + 0.5) * cell_width
                cy = (k + 0.5) * cell_height
                for idx, size in enumerate(anchor_sizes):
                    for idy, ratio in enumerate(anchor_ratios):
                        anchor_idx = idx * len(anchor_ratios) + idy
                        # anchor box coordinates from size, ratio, and position
                        w_anchor = size * np.sqrt(ratio)
                        h_anchor = size / np.sqrt(ratio)
                        x1_anchor = cx - w_anchor / 2
                        y1_anchor = cy - h_anchor / 2
                        anchor_box = [x1_anchor, y1_anchor, w_anchor, h_anchor]
                        # overlap with the ground-truth box
                        overlap = iou(anchor_box, y_boxes[i])
                        if overlap > 0.7:
                            # anchor contains an object
                            rpn_class[i, j, k, anchor_idx] = 1
                            # regression targets: offsets between anchor and ground truth
                            dx = (y_boxes[i][0] - x1_anchor) / w_anchor
                            dy = (y_boxes[i][1] - y1_anchor) / h_anchor
                            dw = np.log(y_boxes[i][2] / w_anchor)
                            dh = np.log(y_boxes[i][3] / h_anchor)
                            rpn_regr[i, j, k, 4*anchor_idx:4*anchor_idx+4] = [dx, dy, dw, dh]
                        elif overlap < 0.3:
                            # background (already 0; anchors with IoU between 0.3 and
                            # 0.7 are also left as background here, where a fuller
                            # implementation would mark them as "ignore")
                            rpn_class[i, j, k, anchor_idx] = 0
    return rpn_class, rpn_regr
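The anchor geometry used above keeps the anchor area fixed at size**2 while the aspect ratio varies: w = size*sqrt(ratio) and h = size/sqrt(ratio), so w*h = size**2 for every ratio. A short sketch (the helper `anchor_wh` is hypothetical) verifies this for size=128, ratio=0.5:

```python
import math

def anchor_wh(size, ratio):
    """Anchor width/height as computed in prepare_rpn_labels."""
    return size * math.sqrt(ratio), size / math.sqrt(ratio)

w, h = anchor_wh(128, 0.5)   # ratio 0.5 gives a tall anchor (w < h)
area = w * h                 # stays at 128**2 = 16384 regardless of ratio
```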
import pydicom
import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def load_and_preprocess_data(df, image_folder):
    images = []
    bounding_boxes = []
    target_labels = []
    for idx, row in df.iterrows():
        # load image using pydicom
        img_path = os.path.join(image_folder, row['patientId'] + '.dcm')
        dicom_data = pydicom.dcmread(img_path)
        img = dicom_data.pixel_array
        # convert grayscale to a 3-channel image
        img = np.stack([img] * 3, axis=-1)
        # resize and normalize image
        img = cv2.resize(img, (224, 224))
        img = img / 255.0
        images.append(img)
        # images without pneumonia have NaN bounding-box values
        if np.isnan(row['x']):
            x_norm, y_norm, w_norm, h_norm = 0, 0, 0, 0
        else:
            x, y = row['x'], row['y']
            w, h = row['width'], row['height']
            # rescale the box from the 1024x1024 DICOM grid to the 224x224 input
            x_norm = (x * 224) / 1024
            y_norm = (y * 224) / 1024
            w_norm = (w * 224) / 1024
            h_norm = (h * 224) / 1024
        bounding_boxes.append([x_norm, y_norm, w_norm, h_norm])
        # use the Target column as the label
        target_labels.append(row['Target'])
    return np.array(images), np.array(bounding_boxes), np.array(target_labels)
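The box rescaling step can be sketched as a standalone helper (the name `normalize_box` is hypothetical): since the DICOM images are 1024x1024 and the network input is 224x224, every coordinate is simply multiplied by 224/1024.

```python
def normalize_box(x, y, w, h, src=1024, dst=224):
    """Rescale a box from the src x src image grid to the dst x dst network input."""
    s = dst / src   # 224 / 1024 = 0.21875
    return x * s, y * s, w * s, h * s

box = normalize_box(512, 512, 256, 128)   # -> (112.0, 112.0, 56.0, 28.0)
```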
image_folder = '/Users/amol/Downloads/AIMLProjects/Capstone/GL/Pneumonia Detection/stage_2_train_images'
images, bounding_boxes, target_labels = load_and_preprocess_data(training_data, image_folder)
#X_train, X_test, y_train, y_test = train_test_split(images, bounding_boxes, test_size=0.2, random_state=42)
X_train, X_test, y_train_bbox, y_test_bbox, y_train_labels, y_test_labels = train_test_split(images, bounding_boxes, target_labels, test_size=0.2, random_state=42)
#rpn_class_train, rpn_regr_train = prepare_rpn_labels(X_train, y_train)
rpn_class_train, rpn_regr_train = prepare_rpn_labels(X_train, y_train_bbox)
# One-hot encode the target labels
from keras.utils import to_categorical
y_train_labels_onehot = to_categorical(y_train_labels)
y_test_labels_onehot = to_categorical(y_test_labels)
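to_categorical simply turns integer labels into one-hot rows. A minimal NumPy equivalent (the helper `to_categorical_np` is hypothetical, shown only to make the mapping explicit for the binary Target column):

```python
import numpy as np

def to_categorical_np(labels, num_classes=None):
    """Minimal NumPy stand-in for keras.utils.to_categorical on integer labels."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = labels.max() + 1
    # index into the identity matrix: label i selects the i-th one-hot row
    return np.eye(num_classes)[labels]

onehot = to_categorical_np([0, 1, 1, 0])   # shape (4, 2)
```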
y_train_bbox.shape
(24181, 4)
y_train_bbox.dtype
dtype('float64')
y_train_bbox
array([[ 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. ],
...,
[127.09375, 87.9375 , 66.71875, 72.625 ],
[ 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. ]])
print("Shape of rpn_class_train:", rpn_class_train.shape)
print("Shape of rpn_regr_train:", rpn_regr_train.shape)
Shape of rpn_class_train: (24181, 7, 7, 9)
Shape of rpn_regr_train: (24181, 7, 7, 36)
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

def create_faster_rcnn(input_shape=(224, 224, 3)):
    # Backbone: feature extraction using ResNet50
    base_model = ResNet50(weights='imagenet', include_top=False, input_shape=input_shape)
    feature_map = base_model.output
    # Region Proposal Network (RPN)
    rpn = layers.Conv2D(256, (3, 3), activation='relu', padding='same')(feature_map)
    # object/background score per anchor: sigmoid, since each of the 9 anchors is an
    # independent binary decision (softmax across anchor channels would be incorrect)
    rpn_class = layers.Conv2D(9, (1, 1), activation='sigmoid', name='rpn_class')(rpn)
    rpn_regr = layers.Conv2D(36, (1, 1), name='rpn_regr')(rpn)  # 4 box deltas per anchor
    # ROI pooling is only a placeholder here: roi_input is declared but the head
    # simply global-average-pools the whole feature map instead of pooling per ROI
    roi_input = layers.Input(shape=(None, 4))
    x_roi = layers.GlobalAveragePooling2D()(feature_map)
    # Classifier and bounding-box regressor head
    x = layers.Flatten()(x_roi)
    x = layers.Dense(512, activation='relu')(x)
    final_class = layers.Dense(2, activation='softmax', name='final_class')(x)  # 2 classes
    final_regr = layers.Dense(4, name='final_regr')(x)  # bounding-box regressor
    # Combine outputs into a single model
    model = models.Model(inputs=[base_model.input, roi_input],
                         outputs=[rpn_class, rpn_regr, final_class, final_regr])
    return model

# Create the model
faster_rcnn_model = create_faster_rcnn()
# Optionally freeze the pre-trained layers to retain their learned features during training
for layer in faster_rcnn_model.layers[:-10]:  # adjust as needed
    layer.trainable = False
faster_rcnn_model.summary()
2023-10-27 04:31:40.896186: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0 []
conv1_pad (ZeroPadding2D) (None, 230, 230, 3) 0 ['input_1[0][0]']
conv1_conv (Conv2D) (None, 112, 112, 64) 9472 ['conv1_pad[0][0]']
conv1_bn (BatchNormalizati (None, 112, 112, 64) 256 ['conv1_conv[0][0]']
on)
conv1_relu (Activation) (None, 112, 112, 64) 0 ['conv1_bn[0][0]']
pool1_pad (ZeroPadding2D) (None, 114, 114, 64) 0 ['conv1_relu[0][0]']
pool1_pool (MaxPooling2D) (None, 56, 56, 64) 0 ['pool1_pad[0][0]']
conv2_block1_1_conv (Conv2 (None, 56, 56, 64) 4160 ['pool1_pool[0][0]']
D)
conv2_block1_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_1_conv[0][0]']
rmalization)
conv2_block1_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_1_bn[0][0]']
ation)
conv2_block1_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block1_1_relu[0][0]']
D)
conv2_block1_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_2_conv[0][0]']
rmalization)
conv2_block1_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_2_bn[0][0]']
ation)
conv2_block1_0_conv (Conv2 (None, 56, 56, 256) 16640 ['pool1_pool[0][0]']
D)
conv2_block1_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block1_2_relu[0][0]']
D)
conv2_block1_0_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_0_conv[0][0]']
rmalization)
conv2_block1_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_3_conv[0][0]']
rmalization)
conv2_block1_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_0_bn[0][0]',
'conv2_block1_3_bn[0][0]']
conv2_block1_out (Activati (None, 56, 56, 256) 0 ['conv2_block1_add[0][0]']
on)
conv2_block2_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block1_out[0][0]']
D)
conv2_block2_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_1_conv[0][0]']
rmalization)
conv2_block2_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_1_bn[0][0]']
ation)
conv2_block2_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block2_1_relu[0][0]']
D)
conv2_block2_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_2_conv[0][0]']
rmalization)
conv2_block2_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_2_bn[0][0]']
ation)
conv2_block2_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block2_2_relu[0][0]']
D)
conv2_block2_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block2_3_conv[0][0]']
rmalization)
conv2_block2_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_out[0][0]',
'conv2_block2_3_bn[0][0]']
conv2_block2_out (Activati (None, 56, 56, 256) 0 ['conv2_block2_add[0][0]']
on)
conv2_block3_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block2_out[0][0]']
D)
conv2_block3_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_1_conv[0][0]']
rmalization)
conv2_block3_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_1_bn[0][0]']
ation)
conv2_block3_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block3_1_relu[0][0]']
D)
conv2_block3_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_2_conv[0][0]']
rmalization)
conv2_block3_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_2_bn[0][0]']
ation)
conv2_block3_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block3_2_relu[0][0]']
D)
conv2_block3_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block3_3_conv[0][0]']
rmalization)
conv2_block3_add (Add) (None, 56, 56, 256) 0 ['conv2_block2_out[0][0]',
'conv2_block3_3_bn[0][0]']
conv2_block3_out (Activati (None, 56, 56, 256) 0 ['conv2_block3_add[0][0]']
on)
conv3_block1_1_conv (Conv2 (None, 28, 28, 128) 32896 ['conv2_block3_out[0][0]']
D)
conv3_block1_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_1_conv[0][0]']
rmalization)
conv3_block1_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_1_bn[0][0]']
ation)
conv3_block1_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block1_1_relu[0][0]']
D)
conv3_block1_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_2_conv[0][0]']
rmalization)
conv3_block1_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_2_bn[0][0]']
ation)
conv3_block1_0_conv (Conv2 (None, 28, 28, 512) 131584 ['conv2_block3_out[0][0]']
D)
conv3_block1_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block1_2_relu[0][0]']
D)
conv3_block1_0_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_0_conv[0][0]']
rmalization)
conv3_block1_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_3_conv[0][0]']
rmalization)
conv3_block1_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_0_bn[0][0]',
'conv3_block1_3_bn[0][0]']
conv3_block1_out (Activati (None, 28, 28, 512) 0 ['conv3_block1_add[0][0]']
on)
conv3_block2_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block1_out[0][0]']
D)
conv3_block2_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_1_conv[0][0]']
rmalization)
conv3_block2_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_1_bn[0][0]']
ation)
conv3_block2_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block2_1_relu[0][0]']
D)
conv3_block2_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_2_conv[0][0]']
rmalization)
conv3_block2_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_2_bn[0][0]']
ation)
conv3_block2_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block2_2_relu[0][0]']
D)
conv3_block2_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block2_3_conv[0][0]']
rmalization)
conv3_block2_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_out[0][0]',
'conv3_block2_3_bn[0][0]']
conv3_block2_out (Activati (None, 28, 28, 512) 0 ['conv3_block2_add[0][0]']
on)
conv3_block3_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block2_out[0][0]']
D)
conv3_block3_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_1_conv[0][0]']
rmalization)
conv3_block3_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_1_bn[0][0]']
ation)
conv3_block3_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block3_1_relu[0][0]']
D)
conv3_block3_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_2_conv[0][0]']
rmalization)
conv3_block3_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_2_bn[0][0]']
ation)
conv3_block3_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block3_2_relu[0][0]']
D)
conv3_block3_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block3_3_conv[0][0]']
rmalization)
conv3_block3_add (Add) (None, 28, 28, 512) 0 ['conv3_block2_out[0][0]',
'conv3_block3_3_bn[0][0]']
conv3_block3_out (Activati (None, 28, 28, 512) 0 ['conv3_block3_add[0][0]']
on)
conv3_block4_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block3_out[0][0]']
D)
conv3_block4_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_1_conv[0][0]']
rmalization)
conv3_block4_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_1_bn[0][0]']
ation)
conv3_block4_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block4_1_relu[0][0]']
D)
conv3_block4_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_2_conv[0][0]']
rmalization)
conv3_block4_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_2_bn[0][0]']
ation)
conv3_block4_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block4_2_relu[0][0]']
D)
conv3_block4_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block4_3_conv[0][0]']
rmalization)
conv3_block4_add (Add) (None, 28, 28, 512) 0 ['conv3_block3_out[0][0]',
'conv3_block4_3_bn[0][0]']
conv3_block4_out (Activati (None, 28, 28, 512) 0 ['conv3_block4_add[0][0]']
on)
conv4_block1_1_conv (Conv2 (None, 14, 14, 256) 131328 ['conv3_block4_out[0][0]']
D)
conv4_block1_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_1_conv[0][0]']
rmalization)
conv4_block1_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_1_bn[0][0]']
ation)
conv4_block1_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block1_1_relu[0][0]']
D)
conv4_block1_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_2_conv[0][0]']
rmalization)
conv4_block1_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_2_bn[0][0]']
ation)
conv4_block1_0_conv (Conv2 (None, 14, 14, 1024) 525312 ['conv3_block4_out[0][0]']
D)
conv4_block1_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block1_2_relu[0][0]']
D)
conv4_block1_0_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_0_conv[0][0]']
rmalization)
conv4_block1_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_3_conv[0][0]']
rmalization)
conv4_block1_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_0_bn[0][0]',
'conv4_block1_3_bn[0][0]']
conv4_block1_out (Activati (None, 14, 14, 1024) 0 ['conv4_block1_add[0][0]']
on)
conv4_block2_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block1_out[0][0]']
D)
conv4_block2_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_1_conv[0][0]']
rmalization)
conv4_block2_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_1_bn[0][0]']
ation)
conv4_block2_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block2_1_relu[0][0]']
D)
conv4_block2_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_2_conv[0][0]']
rmalization)
conv4_block2_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_2_bn[0][0]']
ation)
conv4_block2_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block2_2_relu[0][0]']
D)
conv4_block2_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block2_3_conv[0][0]']
rmalization)
conv4_block2_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_out[0][0]',
'conv4_block2_3_bn[0][0]']
conv4_block2_out (Activati (None, 14, 14, 1024) 0 ['conv4_block2_add[0][0]']
on)
conv4_block3_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block2_out[0][0]']
D)
conv4_block3_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_1_conv[0][0]']
rmalization)
conv4_block3_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_1_bn[0][0]']
ation)
conv4_block3_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block3_1_relu[0][0]']
D)
conv4_block3_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_2_conv[0][0]']
rmalization)
conv4_block3_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_2_bn[0][0]']
ation)
conv4_block3_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block3_2_relu[0][0]']
D)
conv4_block3_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block3_3_conv[0][0]']
rmalization)
conv4_block3_add (Add) (None, 14, 14, 1024) 0 ['conv4_block2_out[0][0]',
'conv4_block3_3_bn[0][0]']
conv4_block3_out (Activati (None, 14, 14, 1024) 0 ['conv4_block3_add[0][0]']
on)
conv4_block4_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block3_out[0][0]']
D)
conv4_block4_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_1_conv[0][0]']
rmalization)
conv4_block4_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_1_bn[0][0]']
ation)
conv4_block4_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block4_1_relu[0][0]']
D)
conv4_block4_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_2_conv[0][0]']
rmalization)
conv4_block4_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_2_bn[0][0]']
ation)
conv4_block4_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block4_2_relu[0][0]']
D)
conv4_block4_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block4_3_conv[0][0]']
rmalization)
conv4_block4_add (Add) (None, 14, 14, 1024) 0 ['conv4_block3_out[0][0]',
'conv4_block4_3_bn[0][0]']
conv4_block4_out (Activati (None, 14, 14, 1024) 0 ['conv4_block4_add[0][0]']
on)
conv4_block5_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block4_out[0][0]']
D)
conv4_block5_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_1_conv[0][0]']
rmalization)
conv4_block5_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_1_bn[0][0]']
ation)
conv4_block5_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block5_1_relu[0][0]']
D)
conv4_block5_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_2_conv[0][0]']
rmalization)
conv4_block5_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_2_bn[0][0]']
ation)
conv4_block5_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block5_2_relu[0][0]']
D)
conv4_block5_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block5_3_conv[0][0]']
rmalization)
conv4_block5_add (Add) (None, 14, 14, 1024) 0 ['conv4_block4_out[0][0]',
'conv4_block5_3_bn[0][0]']
conv4_block5_out (Activati (None, 14, 14, 1024) 0 ['conv4_block5_add[0][0]']
on)
conv4_block6_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block5_out[0][0]']
D)
conv4_block6_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_1_conv[0][0]']
rmalization)
conv4_block6_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_1_bn[0][0]']
ation)
conv4_block6_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block6_1_relu[0][0]']
D)
conv4_block6_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_2_conv[0][0]']
rmalization)
conv4_block6_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_2_bn[0][0]']
ation)
conv4_block6_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block6_2_relu[0][0]']
D)
conv4_block6_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block6_3_conv[0][0]']
rmalization)
conv4_block6_add (Add) (None, 14, 14, 1024) 0 ['conv4_block5_out[0][0]',
'conv4_block6_3_bn[0][0]']
conv4_block6_out (Activati (None, 14, 14, 1024) 0 ['conv4_block6_add[0][0]']
on)
conv5_block1_1_conv (Conv2 (None, 7, 7, 512) 524800 ['conv4_block6_out[0][0]']
D)
conv5_block1_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_1_conv[0][0]']
rmalization)
conv5_block1_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_1_bn[0][0]']
ation)
conv5_block1_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block1_1_relu[0][0]']
D)
conv5_block1_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_2_conv[0][0]']
rmalization)
conv5_block1_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_2_bn[0][0]']
ation)
conv5_block1_0_conv (Conv2 (None, 7, 7, 2048) 2099200 ['conv4_block6_out[0][0]']
D)
conv5_block1_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block1_2_relu[0][0]']
D)
conv5_block1_0_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_0_conv[0][0]']
rmalization)
conv5_block1_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_3_conv[0][0]']
rmalization)
conv5_block1_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_0_bn[0][0]',
'conv5_block1_3_bn[0][0]']
conv5_block1_out (Activati (None, 7, 7, 2048) 0 ['conv5_block1_add[0][0]']
on)
conv5_block2_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block1_out[0][0]']
D)
conv5_block2_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_1_conv[0][0]']
rmalization)
conv5_block2_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_1_bn[0][0]']
ation)
conv5_block2_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block2_1_relu[0][0]']
D)
conv5_block2_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_2_conv[0][0]']
rmalization)
conv5_block2_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_2_bn[0][0]']
ation)
conv5_block2_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block2_2_relu[0][0]']
D)
conv5_block2_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block2_3_conv[0][0]']
rmalization)
conv5_block2_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_out[0][0]',
'conv5_block2_3_bn[0][0]']
conv5_block2_out (Activati (None, 7, 7, 2048) 0 ['conv5_block2_add[0][0]']
on)
conv5_block3_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block2_out[0][0]']
D)
conv5_block3_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_1_conv[0][0]']
rmalization)
conv5_block3_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_1_bn[0][0]']
ation)
conv5_block3_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block3_1_relu[0][0]']
D)
conv5_block3_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_2_conv[0][0]']
rmalization)
conv5_block3_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_2_bn[0][0]']
ation)
conv5_block3_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block3_2_relu[0][0]']
D)
conv5_block3_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block3_3_conv[0][0]']
rmalization)
conv5_block3_add (Add) (None, 7, 7, 2048) 0 ['conv5_block2_out[0][0]',
'conv5_block3_3_bn[0][0]']
conv5_block3_out (Activati (None, 7, 7, 2048) 0 ['conv5_block3_add[0][0]']
on)
global_average_pooling2d ( (None, 2048) 0 ['conv5_block3_out[0][0]']
GlobalAveragePooling2D)
flatten (Flatten) (None, 2048) 0 ['global_average_pooling2d[0][
0]']
conv2d (Conv2D) (None, 7, 7, 256) 4718848 ['conv5_block3_out[0][0]']
dense (Dense) (None, 512) 1049088 ['flatten[0][0]']
input_2 (InputLayer) [(None, None, 4)] 0 []
rpn_class (Conv2D) (None, 7, 7, 9) 2313 ['conv2d[0][0]']
rpn_regr (Conv2D) (None, 7, 7, 36) 9252 ['conv2d[0][0]']
final_class (Dense) (None, 2) 1026 ['dense[0][0]']
final_regr (Dense) (None, 4) 2052 ['dense[0][0]']
==================================================================================================
Total params: 29370291 (112.04 MB)
Trainable params: 5782579 (22.06 MB)
Non-trainable params: 23587712 (89.98 MB)
__________________________________________________________________________________________________
Feature Extraction with ResNet50: We're using the ResNet50 model as the feature extraction backbone, which is a common choice for many object detection models. It will produce a feature map that will be used by the subsequent parts of the network.
Region Proposal Network (RPN): We have a simple RPN that outputs two things:
rpn_class: predicts whether an object is present at each location.
rpn_regr: predicts the bounding box coordinates.
ROI Pooling: We've added a placeholder for ROI pooling using GlobalAveragePooling2D(). In an actual Faster R-CNN implementation, this step extracts fixed-size feature maps from the proposed regions, but the placeholder is acceptable for this simplified design.
Classifier and Bounding Box Regressor: After the ROI pooling step, we have added layers to classify the region and adjust the bounding box coordinates.
Combining Outputs: We're combining all the outputs into a single model.
However, there are a few considerations:
ROI Pooling: The GlobalAveragePooling2D is a placeholder. In a real-world scenario, you'd want a more complex mechanism to handle variable-sized proposals.
Training Data: Training Faster R-CNN requires specialized training data setup where we provide possible proposals (with associated ground truths) to the network.
Loss Function: The training process would require a custom loss function that combines classification loss and bounding box regression loss.
Freezing Layers: We've frozen all layers except the last 10 of the Faster R-CNN model. Depending on how we're training the model, we might want to adjust which layers are frozen or unfrozen.
Output Classes: Ensure the number of classes in final_class matches our data. For a binary problem (Pneumonia or not), we have 2 classes.
Overall, the code provides a skeletal structure for Faster R-CNN. However, training such a model requires a more intricate setup. Implementing Faster R-CNN from scratch can be complex, and it's often more efficient to use existing libraries or frameworks that provide pre-implemented versions.
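To make the ROI-pooling placeholder concrete, here is a minimal NumPy sketch of what real ROI pooling does: crop a region out of a feature map, then max-pool the crop down to a fixed grid. The feature map and ROI below are made-up illustrations, not the model's actual tensors.

```python
import numpy as np

def roi_pool(feature_map, roi, output_size=2):
    """Crop roi = (x1, y1, x2, y2) from an (H, W, C) feature map and
    max-pool the crop down to (output_size, output_size, C)."""
    x1, y1, x2, y2 = roi
    crop = feature_map[y1:y2, x1:x2, :]
    h, w, c = crop.shape
    pooled = np.zeros((output_size, output_size, c))
    for i in range(output_size):
        for j in range(output_size):
            # Integer bin edges covering the whole crop; max() guards empty bins
            ys = slice(i * h // output_size,
                       max((i + 1) * h // output_size, i * h // output_size + 1))
            xs = slice(j * w // output_size,
                       max((j + 1) * w // output_size, j * w // output_size + 1))
            pooled[i, j, :] = crop[ys, xs, :].max(axis=(0, 1))
    return pooled

fmap = np.arange(49, dtype=float).reshape(7, 7, 1)  # toy 7x7 feature map
out = roi_pool(fmap, roi=(1, 1, 5, 5), output_size=2)
print(out.shape)  # (2, 2, 1)
```

Whatever the ROI size, the output grid is fixed, which is what lets the downstream dense layers accept variable-sized proposals.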
import tensorflow as tf

def iou_loss(y_true, y_pred):
    """1 - IoU for boxes in (x, y, width, height) format."""
    x1_t, y1_t, w_t, h_t = tf.split(y_true, 4, axis=1)
    x1_p, y1_p, w_p, h_p = tf.split(y_pred, 4, axis=1)
    # Corners of the intersection rectangle
    xi1 = tf.maximum(x1_t, x1_p)
    yi1 = tf.maximum(y1_t, y1_p)
    xi2 = tf.minimum(x1_t + w_t, x1_p + w_p)
    yi2 = tf.minimum(y1_t + h_t, y1_p + h_p)
    inter_area = tf.maximum(0.0, xi2 - xi1) * tf.maximum(0.0, yi2 - yi1)
    true_area = w_t * h_t
    pred_area = w_p * h_p
    union_area = (true_area + pred_area) - inter_area
    # Small epsilon guards against division by zero for empty boxes
    iou = inter_area / (union_area + 1e-10)
    return 1.0 - iou
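The IoU arithmetic can be sanity-checked with plain Python before trusting the TensorFlow version; this is a standalone re-derivation of the same formula, not the function above:

```python
def iou_xywh(a, b):
    """IoU of two boxes in (x, y, width, height) format."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    # Intersection rectangle corners
    xi1, yi1 = max(ax1, bx1), max(ay1, by1)
    xi2, yi2 = min(ax1 + aw, bx1 + bw), min(ay1 + ah, by1 + bh)
    inter = max(0.0, xi2 - xi1) * max(0.0, yi2 - yi1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes offset by (1, 1): intersection 1, union 7
print(iou_xywh((0, 0, 2, 2), (1, 1, 2, 2)))  # 0.142857...
# Disjoint boxes have zero IoU
print(iou_xywh((0, 0, 1, 1), (5, 5, 1, 1)))  # 0.0
```

Small hand-checkable cases like these catch sign and corner-convention mistakes that are easy to make in the tensor version.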
losses = {
    'rpn_class': 'binary_crossentropy',
    'rpn_regr': 'mean_squared_error',
    'final_class': 'categorical_crossentropy',
    'final_regr': 'mean_squared_error'
}
loss_weights = {
    'rpn_class': 1.,
    'rpn_regr': 1.,
    'final_class': 1.,
    'final_regr': 1.
}
# Compile the model
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
faster_rcnn_model.compile(optimizer=optimizer, loss=losses, loss_weights=loss_weights)
def data_gen(X, y, batch_size):
    while True:
        for idx in range(0, len(X), batch_size):
            batch_X = X[idx:idx + batch_size]
            # Batch the RPN labels in step with the images
            batch_rpn_class = y[0][idx:idx + batch_size]
            batch_rpn_regr = y[1][idx:idx + batch_size]
            # Dummy ROI input, sized to the actual batch (the last batch may be short)
            dummy_roi = np.zeros((len(batch_X), 4))
            yield [batch_X, dummy_roi], [batch_rpn_class, batch_rpn_regr]
# Example usage:
batch_size = 32
train_gen = data_gen(X_train, y_train, batch_size)
rpn_class_test, rpn_regr_test = prepare_rpn_labels(X_test, y_test_bbox)
print("Shape of rpn_class_test:", rpn_class_test.shape)
print("Shape of rpn_regr_test:", rpn_regr_test.shape)
Shape of rpn_class_test: (6046, 7, 7, 9)
Shape of rpn_regr_test: (6046, 7, 7, 36)
# Organize training labels
y_train = [rpn_class_train, rpn_regr_train, y_train_labels_onehot, y_train_bbox]
# Organize testing labels
y_test = [rpn_class_test, rpn_regr_test, y_test_labels_onehot, y_test_bbox]
tf.config.run_functions_eagerly(True)
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
checkpoint = ModelCheckpoint("faster_rcnn_model.h5", monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=True, mode='min')
early_stop = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1, mode='min')
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, verbose=1, mode='min', min_delta=0.0001)
callbacks_list = [checkpoint, early_stop, reduce_lr]
# Create dummy ROI data for training and testing: 32 dummy ROIs of 4 coordinates per image.
dummy_roi_train = np.zeros((X_train.shape[0], 32, 4))
dummy_roi_test = np.zeros((X_test.shape[0], 32, 4))
tf.config.run_functions_eagerly(True)
history = faster_rcnn_model.fit(
[X_train, dummy_roi_train], y_train,
validation_data=([X_test, dummy_roi_test], y_test),
epochs=5,
batch_size=32,
callbacks=callbacks_list
)
Epoch 1/5
756/756 [==============================] - 5671s 7s/step - loss: 1546.9259 - rpn_class_loss: 0.0154 - rpn_regr_loss: 0.0086 - final_class_loss: 0.6485 - final_regr_loss: 1546.2535 - val_loss: 1527.4515 - val_rpn_class_loss: 3.4049e-04 - val_rpn_regr_loss: 5.5051e-04 - val_final_class_loss: 0.6322 - val_final_regr_loss: 1526.8186 - lr: 1.0000e-04
Epoch 1: val_loss improved from inf to 1527.45154, saving model to faster_rcnn_model.h5
Epoch 2/5
756/756 [==============================] - 5780s 8s/step - loss: 1500.7728 - rpn_class_loss: 2.0192e-04 - rpn_regr_loss: 4.6397e-04 - final_class_loss: 0.6234 - final_regr_loss: 1500.1489 - val_loss: 1523.3156 - val_rpn_class_loss: 1.1778e-04 - val_rpn_regr_loss: 4.5719e-04 - val_final_class_loss: 0.6254 - val_final_regr_loss: 1522.6904 - lr: 1.0000e-04
Epoch 2: val_loss improved from 1527.45154 to 1523.31555, saving model to faster_rcnn_model.h5
Epoch 3/5
756/756 [==============================] - 5752s 8s/step - loss: 1497.0187 - rpn_class_loss: 9.9010e-05 - rpn_regr_loss: 3.9484e-04 - final_class_loss: 0.6229 - final_regr_loss: 1496.3955 - val_loss: 1520.1262 - val_rpn_class_loss: 7.1695e-05 - val_rpn_regr_loss: 3.2688e-04 - val_final_class_loss: 0.6215 - val_final_regr_loss: 1519.5042 - lr: 1.0000e-04
Epoch 3: val_loss improved from 1523.31555 to 1520.12622, saving model to faster_rcnn_model.h5
Epoch 4/5
756/756 [==============================] - 5825s 8s/step - loss: 1492.3472 - rpn_class_loss: 6.9467e-05 - rpn_regr_loss: 3.5632e-04 - final_class_loss: 0.6203 - final_regr_loss: 1491.7260 - val_loss: 1513.5406 - val_rpn_class_loss: 5.2849e-05 - val_rpn_regr_loss: 2.6048e-04 - val_final_class_loss: 0.6169 - val_final_regr_loss: 1512.9235 - lr: 1.0000e-04
Epoch 4: val_loss improved from 1520.12622 to 1513.54065, saving model to faster_rcnn_model.h5
Epoch 5/5
756/756 [==============================] - 5924s 8s/step - loss: 1486.9891 - rpn_class_loss: 5.4927e-05 - rpn_regr_loss: 3.2871e-04 - final_class_loss: 0.6151 - final_regr_loss: 1486.3734 - val_loss: 1507.6414 - val_rpn_class_loss: 4.2115e-05 - val_rpn_regr_loss: 3.6851e-04 - val_final_class_loss: 0.6083 - val_final_regr_loss: 1507.0325 - lr: 1.0000e-04
Epoch 5: val_loss improved from 1513.54065 to 1507.64136, saving model to faster_rcnn_model.h5
Training Loss: Each epoch reports a breakdown of the loss components (RPN class loss, RPN regression loss, final class loss, and final regression loss) along with the total loss.
final_class_loss is decreasing, which indicates the classifier is improving over time. final_regr_loss, the bounding box regression loss, contributes by far the most to the total loss; it is also decreasing, meaning the model is getting better at predicting bounding boxes.
Validation Loss: After each epoch, the model is evaluated on the validation set and the validation loss components are displayed.
val_loss decreases with each epoch, which suggests the model is generalizing to unseen data rather than memorizing the training set. The individual components (val_rpn_class_loss, val_rpn_regr_loss, val_final_class_loss, and val_final_regr_loss) track their training counterparts, and their decrease is likewise a positive sign.
Model Saving: A checkpoint is written whenever the validation loss improves, so the best weights are always available. The message "saving model to faster_rcnn_model.h5" confirms each save.
Training Time: Each epoch takes roughly 95-99 minutes (5671-5924 seconds). Training a model as complex as Faster R-CNN on a CPU is very time-consuming; with access to a GPU, future runs could train for more epochs in far less time.
In summary: the model is training as expected. Both training and validation losses are decreasing, indicating the model is learning and improving its predictions. However, the final regression loss still dominates the total, so further training (and rebalancing the loss scales) would likely be needed before the predictions are reliable.
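One reason final_regr_loss dominates (around 1500 versus around 0.6 for the class losses) is that the boxes are regressed in raw pixel units while the cross-entropy losses are order 1. Normalizing box coordinates by the image size puts the MSE on a comparable scale. A quick illustration with made-up boxes (the coordinates below are illustrative, not from the dataset):

```python
import numpy as np

img_size = 224.0
# Illustrative (x, y, w, h) boxes in pixel units
y_true = np.array([[40., 60., 120., 150.]])
y_pred = np.array([[55., 70., 100., 140.]])

mse_pixels = np.mean((y_true - y_pred) ** 2)
mse_normed = np.mean((y_true / img_size - y_pred / img_size) ** 2)

print(mse_pixels)  # 206.25 -- the same scale as the final_regr_loss seen above
print(mse_normed)  # 206.25 / 224**2 ~= 0.0041, comparable to the class losses
```

The same geometric error shrinks by a factor of 224² once normalized, so the four loss terms pull on the shared backbone with comparable strength instead of the regression head swamping the rest.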
scores = faster_rcnn_model.evaluate(X_test, y_test)
print("Loss: ", scores[0])
print("Accuracy: ", scores[1])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[142], line 1
----> 1 scores = faster_rcnn_model.evaluate(X_test, y_test)
...
ValueError: Layer "model" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor: shape=(32, 224, 224, 3), dtype=float32, ...>]
The call fails because the model was built with two inputs (the image tensor and the dummy ROI tensor), but evaluate received only the images. Passing both inputs, e.g. faster_rcnn_model.evaluate([X_test, dummy_roi_test], y_test), resolves the error.
import matplotlib.pyplot as plt
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss Value')
plt.legend()
plt.show()
faster_rcnn_model.save("faster_rcnn_final_model.h5")
sample_data = training_data.sample(n=1000, random_state=42)
images, bounding_boxes, target_labels = load_and_preprocess_data(sample_data, image_folder)
X_train, X_test, y_train, y_test = train_test_split(images, bounding_boxes, test_size=0.2, random_state=42)
rpn_class_train, rpn_regr_train = prepare_rpn_labels(X_train, y_train)
rpn_class_test, rpn_regr_test = prepare_rpn_labels(X_test, y_test)
# Note: train_test_split shuffles, so slicing target_labels / bounding_boxes by
# position does not align row-for-row with X_train / X_test. Splitting every
# array with a shared index permutation avoids this misalignment.
y_train = [rpn_class_train, rpn_regr_train, target_labels[:X_train.shape[0]], bounding_boxes[:X_train.shape[0]]]
y_test = [rpn_class_test, rpn_regr_test, target_labels[X_train.shape[0]:], bounding_boxes[X_train.shape[0]:]]
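Because train_test_split shuffles, slicing the label and box arrays by position does not keep them aligned with the split images. Splitting every array with one shared index permutation keeps rows aligned; a sketch with small dummy arrays (the names and sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10
images = np.arange(n)          # stand-ins for the image array
labels = np.arange(n) * 10     # stand-ins for target labels
bboxes = np.arange(n) * 100    # stand-ins for bounding boxes

perm = rng.permutation(n)      # one shuffle shared by every array
split = int(n * 0.8)
train_idx, test_idx = perm[:split], perm[split:]

X_tr, X_te = images[train_idx], images[test_idx]
lab_tr, lab_te = labels[train_idx], labels[test_idx]
box_tr, box_te = bboxes[train_idx], bboxes[test_idx]

# Rows stay aligned: each sample's label and box travel with its image
assert np.all(lab_tr == X_tr * 10)
assert np.all(box_te == X_te * 100)
```

The same effect is available from sklearn by passing all the arrays to a single train_test_split call, which applies one shuffle to everything.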
# Ensure that sample_images is a numpy array and has the shape (num_samples, height, width, channels)
sample_images = np.array(sample_images)
# Adjust the dummy ROI input to match the number of samples in sample_images
num_samples = len(sample_images)
dummy_roi_sample = np.zeros((num_samples, 32, 4))
# Now, get model predictions
predictions = faster_rcnn_model.predict([sample_images, dummy_roi_sample])
1/1 [==============================] - 1s 1s/step
# Note: Don't forget to create dummy ROIs for prediction, similar to training
dummy_roi_sample = np.zeros((20, 32, 4))
# Get model predictions
predictions = faster_rcnn_model.predict([sample_images, dummy_roi_sample])
# Extract predicted class scores and bounding boxes
predicted_scores = predictions[2]
predicted_boxes = predictions[3]
# Convert predicted scores to class labels
predicted_labels = np.argmax(predicted_scores, axis=1)
1/1 [==============================] - 1s 1s/step
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
# 1. Select 20 Random Images
num_samples = 20
random_indices = np.random.choice(X_test.shape[0], num_samples, replace=False)
sample_images = X_test[random_indices]
sample_bboxes = y_test_bbox[random_indices]
sample_labels = y_test_labels[random_indices]
# 2. Predict with the Model
dummy_roi_sample = np.zeros((num_samples, 32, 4))
predictions = faster_rcnn_model.predict([sample_images, dummy_roi_sample])
predicted_bboxes = predictions[3]
predicted_labels = np.argmax(predictions[2], axis=1)
# 3. Draw Bounding Boxes and Display Images
fig, axs = plt.subplots(5, 4, figsize=(16, 20)) # Changed layout to 5x4 for better visualization
for i, ax in enumerate(axs.ravel()):
    ax.imshow(sample_images[i])
    ax.set_title(f"Predicted: {predicted_labels[i]}\nActual: {sample_labels[i]}")
    # Draw the actual bounding box (blue) if one exists.
    # Boxes follow the dataset's (x, y, width, height) convention,
    # so width and height are used directly rather than corner differences.
    if not np.array_equal(sample_bboxes[i], [0., 0., 0., 0.]):
        x, y, w, h = sample_bboxes[i]
        rect = patches.Rectangle((x, y), w, h, linewidth=2, edgecolor='b', facecolor='none')
        ax.add_patch(rect)
    # Draw the predicted bounding box in red
    if not np.array_equal(predicted_bboxes[i], [0., 0., 0., 0.]):
        x, y, w, h = predicted_bboxes[i]
        rect = patches.Rectangle((x, y), w, h, linewidth=2, edgecolor='r', facecolor='none')
        ax.add_patch(rect)
    ax.axis('off')
plt.tight_layout()
plt.show()
1/1 [==============================] - 1s 1s/step
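A natural next step is to score the drawn boxes quantitatively rather than by eye. Assuming the (x, y, w, h) convention used by iou_loss earlier, a vectorized mean-IoU check could look like this; the arrays below are dummies standing in for predicted_bboxes and sample_bboxes:

```python
import numpy as np

def batch_iou(true_boxes, pred_boxes):
    """Element-wise IoU for two (N, 4) arrays of (x, y, w, h) boxes."""
    tx, ty, tw, th = true_boxes.T
    px, py, pw, ph = pred_boxes.T
    # Intersection rectangle per pair
    xi1 = np.maximum(tx, px)
    yi1 = np.maximum(ty, py)
    xi2 = np.minimum(tx + tw, px + pw)
    yi2 = np.minimum(ty + th, py + ph)
    inter = np.maximum(0, xi2 - xi1) * np.maximum(0, yi2 - yi1)
    union = tw * th + pw * ph - inter
    return inter / np.maximum(union, 1e-10)

true_b = np.array([[0., 0., 2., 2.], [10., 10., 4., 4.]])
pred_b = np.array([[1., 1., 2., 2.], [10., 10., 4., 4.]])
ious = batch_iou(true_b, pred_b)
print(ious)         # first pair overlaps partially, second is a perfect match
print(ious.mean())  # mean IoU over the sample
```

Running this over the 20 sampled test images would give a single mean-IoU number to track alongside the loss curves.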